statistics of text

2005-11-02 Thread Ing. Branislav Gerzo
Hi all, I have quite interesting work. Example: In txt I have some words (up to 100.000) - words.txt (without line numbers): 1. foo 2. bar 3. foo bar 4. foo bar bar 5. bar foo bar 6. bar bar foo 7. foo foo bar 8. foo bar foo bar 9. foob bar 10.foo bars and so on... Now, I have to find all 2 wor

Re: statistics of text

2005-11-02 Thread Ricardo SIGNES
* "Ing. Branislav Gerzo" <[EMAIL PROTECTED]> [2005-11-02T08:52:39] > I have quite interesting work. Example: I wish /I/ could find this sort of work interesting! Or profitable. > Now, I have to find all 2 words sentences with their sums in the list. > For example for this list it could be (witho

Re: statistics of text

2005-11-02 Thread Ing. Branislav Gerzo
Ricardo SIGNES [RS], on Wednesday, November 2, 2005 at 09:24 (-0500) wrote the following: RS> my %occurances; # we'll put occurrences here RS> while (<>) { # get each line of argument files (or stdin) RS> chomp; # eliminate the newline RS> my @wo
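
For readers following the thread, here is a minimal sketch of the hash-based counting being discussed. It is not the code posted above, which is truncated in this archive. The helper name `count_pairs` is hypothetical, and the sketch assumes that every adjacent pair of words counts once per occurrence, so a line may contribute the same pair twice.

```perl
use strict;
use warnings;

# Count every adjacent two-word phrase across a list of lines.
# count_pairs is a hypothetical helper name, not taken from the thread.
sub count_pairs {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my @words = split ' ', $line;    # split on runs of whitespace
        # Slide a two-word window across the line; one-word lines add nothing.
        for my $i (0 .. $#words - 1) {
            $count{"$words[$i] $words[$i + 1]"}++;
        }
    }
    return \%count;
}

# Sample data copied from the original question.
my $pairs = count_pairs(
    'foo', 'bar', 'foo bar', 'foo bar bar', 'bar foo bar',
    'bar bar foo', 'foo foo bar', 'foo bar foo bar', 'foob bar', 'foo bars',
);

# Print the most frequent phrases first, ties in alphabetical order.
for my $phrase (sort { $pairs->{$b} <=> $pairs->{$a} || $a cmp $b } keys %$pairs) {
    print "$phrase - $pairs->{$phrase}\n";
}
```

With this per-occurrence counting, "foo bar" comes out at 6 for the sample list. If a phrase should count only once per line, the inner loop would need to track which phrases it has already seen on that line.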

Re: statistics of text

2005-11-02 Thread Shawn Corey
Ing. Branislav Gerzo wrote: I did this by hand... but does anyone know how to do this effectively in Perl? I think I have to build a hash of all possible 2-word sentences (in the input txt only [0-9a-z ] are allowed); in a list I will have the lines of the input txt, and iterate every key in the hash over the array, wri

Re: statistics of text

2005-11-02 Thread Shawn Corey
Ing. Branislav Gerzo wrote: my $keyword = "business"; my %all2words = (); open TXT, "words.txt" or die $!; while (<TXT>) { chomp; next unless /\b$keyword\b/; # skip loop if no keyword while ( /(?=(\S+\s+\S+))\S+/g ) { my $temp = $1; $all2wo
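
The pattern `/(?=(\S+\s+\S+))\S+/g` quoted above relies on a lookahead to produce overlapping matches, which is easy to miss on first reading. The following is a small sketch of that technique on its own. The helper name `overlapping_pairs` is hypothetical, and the keyword filter from the quoted code is left out.

```perl
use strict;
use warnings;

# Return every overlapping two-word pair in a line.
# overlapping_pairs is a hypothetical helper name used for illustration.
sub overlapping_pairs {
    my ($line) = @_;
    my @pairs;
    # (?=(\S+\s+\S+)) captures two words without consuming them;
    # the trailing \S+ then consumes only the first word, so the next
    # match attempt begins at the second word of the previous pair.
    while ($line =~ /(?=(\S+\s+\S+))\S+/g) {
        push @pairs, $1;
    }
    return @pairs;
}

# Prints "foo bar", "bar foo", "foo bar", one per line.
print "$_\n" for overlapping_pairs('foo bar foo bar');
```

Note that `$1` keeps whatever whitespace separated the two words in the input, so pairs split by a tab or by several spaces become distinct hash keys unless they are normalized first.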

Re: statistics of text

2005-11-02 Thread John Doe
Ing. Branislav Gerzo am Mittwoch, 2. November 2005 14.52: > Hi all, > > I have quite interesting work. Example: > > In txt I have some words (up to 100.000) - words.txt (without line > numbers): > 1. foo > 2. bar > 3. foo bar > 4. foo bar bar > 5. bar foo bar > 6. bar bar foo > 7. foo foo bar > 8.

Re: statistics of text

2005-11-02 Thread Brano Gerzo
Shawn Corey [SC], on Wednesday, November 02, 2005 at 10:59 (-0500) thoughtfully wrote the following: SC>next unless /\b$keyword\b/; # skip loop if no keyword Yes, that's right, but in my case there will *always* be a keyword on the line :) Also, thanks for all the replies and the nice examples.

quickly snag all numbers

2005-11-02 Thread Bryan R Harris
I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. in the following: 1 "M_fx,-3,+2.p2" -31.4e-1 4. I'd like to pick up only the 1, -31.4e-1, and 4. I've tried: $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-9]+\.?[0-9]*)?($|\s)/; @list = m/$ss/g; ... but it slurps
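
One general point about patterns of this shape: when a regex consumes the whitespace on both sides of a number, two numbers separated by a single space cannot both match, and `m//g` in list context returns every capture group rather than just the number. The sketch below is one possible way around both issues, offered as an assumption about what is wanted rather than as the answer given in this thread. It keeps the numeric part of the pattern from the question, makes its groups non-capturing, and swaps the consumed boundaries for lookarounds. The helper name `padded_numbers` is hypothetical, and the sample input drops the trailing period from the question's example so that the last token is plainly `4`.

```perl
use strict;
use warnings;

# Numeric token taken from the question's pattern, with the exponent
# group made non-capturing so only the whole number is returned.
my $num = qr/[+-]?[0-9]+\.?[0-9]*(?:[eE][+-]?[0-9]+\.?[0-9]*)?/;

# Return all numbers that are bounded by whitespace or by the string edges.
# padded_numbers is a hypothetical helper name used for illustration.
sub padded_numbers {
    my ($text) = @_;
    # (?<!\S) and (?!\S) assert "whitespace or edge" without consuming it,
    # so neighbouring numbers can share a single separator.
    return $text =~ /(?<!\S)($num)(?!\S)/g;
}

my @list = padded_numbers('1 "M_fx,-3,+2.p2" -31.4e-1 4');
print join(', ', @list), "\n";    # prints: 1, -31.4e-1, 4
```

Because the numeric part still allows a trailing dot, an input token written as `4.` would be returned as `4.` and not `4`.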

RE: quickly snag all numbers

2005-11-02 Thread Charles K. Clarkson
Bryan R Harris wrote: : I'd like to snag all the whitespace padded numbers in $_ into an : array, e.g. in the following: : : 1 "M_fx,-3,+2.p2" -31.4e-1 4. Can you give more than one example? If everything fits this form, then don't look for numbers. Just delete

Re: quickly snag all numbers

2005-11-02 Thread Bob Showalter
Bryan R Harris wrote: I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. in the following: 1 "M_fx,-3,+2.p2" -31.4e-1 4. I'd like to pick up only the 1, -31.4e-1, and 4. I've tried: $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-9]+\.?[0-9]*)?($|\s)/; @list = m/$ss

Re: quickly snag all numbers

2005-11-02 Thread Jay Savage
On 11/2/05, Bryan R Harris <[EMAIL PROTECTED]> wrote: > > > I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. > in the following: > > 1 "M_fx,-3,+2.p2" -31.4e-1 4. > > I'd like to pick up only the 1, -31.4e-1, and 4. > > I've tried: > > $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*

Re: quickly snag all numbers

2005-11-02 Thread Bryan R Harris
> Bryan R Harris wrote: > : I'd like to snag all the whitespace padded numbers in $_ into an > : array, e.g. in the following: > : > : 1 "M_fx,-3,+2.p2" -31.4e-1 4. > > Can you give more than one example? The script is intended to normalize all the data in a fil

Re: quickly snag all numbers

2005-11-02 Thread Bryan R Harris
> Bryan R Harris wrote: >> >> I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. >> in the following: >> >> 1 "M_fx,-3,+2.p2" -31.4e-1 4. >> >> I'd like to pick up only the 1, -31.4e-1, and 4. >> >> I've tried: >> >> $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-

Re: quickly snag all numbers

2005-11-02 Thread Bob Showalter
Bryan R Harris wrote: Bryan R Harris wrote: I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. in the following: 1 "M_fx,-3,+2.p2" -31.4e-1 4. I'd like to pick up only the 1, -31.4e-1, and 4. I've tried: $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-9]+\.?[0-9

Re: quickly snag all numbers

2005-11-02 Thread Bryan R Harris
I'd like to snag all the whitespace padded numbers in $_ into an array, e.g. in the following: 1 "M_fx,-3,+2.p2" -31.4e-1 4. I'd like to pick up only the 1, -31.4e-1, and 4. I've tried: $ss = qr/(^|\s)[+-]?[0-9]+\.?[0-9]*([eE][+-]?[0-9]+\.