hi all,
our long time interim fearless leader will not be leading the meeting
this coming tuesday. he left the controls in my safe hands. there will
be two main talks, a short one by me on new features and planned stuff
in File::Slurp and also we will watch a video of schwern on perl5i which
is
On Sat, 05 Feb 2011 18:27:13 -0500 Charlie wrote:
C> The sample program below runs in 00:09:04 on 1.15GB (1024 copies of
C> Moby Dick). Replacing the hard-coded map with 2 entries with 6000
C> words taken from the text (randomly selected, unique, >5 chars) runs
C> in 00:09:17. I.e. the
Short answer, no, Perl regex will not build an optimal lookup of a token
into your set of 6000 names. In general, if speed is the issue, do not use
regex. It does not scale.
Also, be clear on the 2 problems at hand: 1) tokenizing 1GB of input text
and 2) adding a prefix to identified
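
A minimal sketch of the hash-lookup idea, assuming whole-word names and a made-up three-entry list in place of the 6000 words (the names `%is_name` and `prefix_names` are hypothetical, not from the thread). It still uses one trivial regex to tokenize; the point is that each token costs a single hash lookup rather than a scan over every name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the 6000-word list discussed in the thread.
my %is_name = map { $_ => 1 } qw(Ahab Ishmael Queequeg);

# Tokenize on word characters and prefix only tokens found in the set.
# Each token costs one hash lookup, independent of the size of the set.
sub prefix_names {
    my ($line) = @_;
    $line =~ s/(\w+)/exists $is_name{$1} ? "prefix_$1" : $1/ge;
    return $line;
}

print prefix_names("Call me Ishmael, said Ahab.\n");
```

Whether this beats a single compiled alternation on real data would need to be measured; it is only meant to show the shape of the approach.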
On Sat, Feb 5, 2011 at 3:43 PM, Alex Vandiver wrote:
> Since we're talking about literals, this hasn't been true since 2007,
> with the release of perl 5.10. Perl now uses an Aho-Corasick trie
> algorithm internally for literal alternations, which allows for
> matching without backtracking:
Aha,
At Fri Feb 04 18:53:09 -0500 2011, Uri Guttman wrote:
> that will kill your cpu. alternations are very slow since they have to
> go back and try from the beginning of the list each time.
Since we're talking about literals, this hasn't been true since 2007,
with the release of perl 5.10. Perl now
> "MP" == Martyn Peck writes:
MP> What's wrong with something like this:
MP> while ($line = <>) {
MP>     foreach my $name (@names) {
MP>         $line =~ s/($name)/prefix_$1/g;
MP>     }
MP> }
it is O( N^2 ) which is very slow for large data sets.
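
one way to avoid the nested loop, sketched here under assumptions (a made-up three-name list, names treated as literal whole words, and a hypothetical helper `prefix_line`), is to compile all the names into a single alternation once and make one pass per line:

```perl
use strict;
use warnings;

# Hypothetical sample; the real list would be loaded from elsewhere.
my @names = qw(Ahab Ishmael Queequeg);

# Build one alternation of literals. quotemeta keeps metacharacters literal;
# sorting longest-first lets a longer name win over one that is its prefix.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) } @names;
my $re  = qr/\b($alt)\b/;

# One substitution pass per line instead of one pass per name.
sub prefix_line {
    my ($line) = @_;
    $line =~ s/$re/prefix_$1/g;
    return $line;
}

print prefix_line("Ahab met Ishmael.\n");
```

the regex is compiled once outside the read loop, so each line is scanned a single time regardless of how many names are in the list.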
MP> I know it seems
> "RJK" == Ronald J Kimball writes:
RJK> On Fri, Feb 04, 2011 at 11:26:05PM -0500, Uri Guttman wrote:
>> > "CW" == Conor Walsh writes:
>>
CW> I suppose I have a new valid answer to my favorite "do you really know
CW> Perl or do you just know the syntax" interview question.
hi
Ok, I've been reading over the responses you've been getting and I just
have to ask everyone.
What's wrong with something like this:
while ($line = <>) {
    foreach my $name (@names) {
        # capture the name so $1 is defined in the replacement
        $line =~ s/($name)/prefix_$1/g;
    }
}
I know it seems kind of
On Saturday 05 Feb 2011 00:23:50 Conor Walsh wrote:
> On 2/4/2011 2:04 PM, Asa Martin wrote:
> > I was told that "predeclaring" the variables outside the loop saved on
> > memory allocation, and that using @rules instead of four named variables
> > was also more efficient. I had never considered