Is there something wrong with this algo ??
####################################################
my @start = (3,4, 15, 23, 29, 34, 37);
my @stop = (5,10,20, 29, 33, 36, 37);
# Assuming that the start and the stop array have the
# same number of elements and each start corresponds with
# each stop
my $i = 0;
my $hash{$start[0]} = $stop[0];
my $refStart = $start[0];
my $refStop = $stop[0];
for ($i = 1; $i <= $#start; $i++) {
if ($refStop >= $start[$i]) {
$refStop = $stop[$i];
$hash{$refStart} = $stop[$i];
} else {
$refStart = $start[$i];
$refStop = $stop[$i];
$hash{$refStart} = $stop[$i];
}
}
foreach (sort { $a <=> $b } keys %hash) {
print $_, " ", $hash{$_}, "\n";
}
######################################################
-----Original Message-----
From: Jonathan E. Paton [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 31, 2002 1:58 PM
To: [EMAIL PROTECTED]
Subject: RE: union of times algorithm
--- [EMAIL PROTECTED] wrote:
> What we have is as stated a union of times. So all I do is take the
> timeslices and generate a hash element for each time. I don't check to see
> if it is already there, but just set to one. Then I sort the hash down
> numerically and total where current minus previous equal 1. Here is a shot:
>
> use Data::Dumper;
>
> my @timeslices = (
> [4, 9],
> [3, 6],
> [2, 4],
> [11,12],
> [14,14],
> [17,19]
> );
>
> my %MyHashCounts = ();
>
> for(my $MyId=0;$MyId<=$#timeslices;$MyId++){
> for($MyId1=$timeslices[$MyId][0];$MyId1<=$timeslices[$MyId][1];$MyId1++)
> {
> $MyHashCounts{$MyId1}=1;
> }
> }
> print Dumper(\@timeslices);
> print Dumper(\%MyHashCounts);
>
> my $MyPrev = -10;
> my $MyTotal = 0;
> foreach my $MyKeys (sort {$a <=> $b} keys %MyHashCounts) {
> printf "%3d ", $MyKeys;
> $MyTotal++ if ( ($MyKeys - $MyPrev) == 1 );
> $MyPrev = $MyKeys;
> }
> printf "\n";
> printf "Total: %4d\n", $MyTotal;
>
> Output:
> 2 3 4 5 6 7 8 9 11 12 14 17 18 19
> Total: 10
Which is wrong, since the 14 sneaked in even though the
contribution of [14,14] should be zero. Notice:
0 1 2
1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
> | [4, | 9], | |
> | [3, 6], | | | | |
> |[2,| 4], | | | | | | |
> | | | | | | | | | |[11,
> | | | | | | | | | | | 12],
> | | | | | | | | | | | | |[14,
> | | | | | | | | | | | | | 14],
> | | | | | | | | | | | | | | | |[17,
| | | | | | | | | | | | | | | | | |19]
X X X X X X X X X X X = 11
Looking at my algorithm, I can remove the use of the
hash.
The algorithm you are using is inferiour to the approach
I tried (even though BOTH our algorithms are wrong). If
I start using numbers in the region of Unix timestamps,
(i.e. large) it is an impossible to satisfy memory/CPU hog.
The algorithm I use is pictorally:
XXXXXXXXXXXXXXXXXXX
YYYYYYYYYYYYYYYYYYYYYYY
ZZZZZZZZZZZZZZZZZZZZZZ
^ ^ ^
start X ^ start Y end Z ^
start Z end Y
^
end X
Sorting results in the following events:
^start X
^start Z
^end X
^start Y
^end Z
^ end Y
And we can find the number of overlapping sections now:
ACTIVE
V
1 ^start X
12 ^start Z
12 ^end X
23 ^start Y
2 ^end Z
3 ^ end Y
So, by sorting we can loop through, starting and stopping "active"
timeslots. We need to do this in order, otherwise we will get
incorrect results. I sort the second column to always get a start
event before a stop event.
Jonathan Paton
=====
$_=q|.,&@$$. ,.@$&@$. .&$$@. ,,$ ....!$_=$p.'&$@.',y'&$@' ..,';for(/\S+/g){
!|.q| .$ .,@, ,$, .,.. @, ,$ ,,@ ..,,.!++$.<22?${'y'.$_}=chr$.+64:[$$=${'y'
!|.q| ,@$@&.,. $$$&, ..@&&$,,, $., ...!.$_},$y.=($.=~/22\|26\|3(3\|7)/x?' '
!|.q|. @ ., ,.&,,, , .$..&. .,$ .,,!.$$:"\l$$")]};$y=~/ (.*)/;warn"$1\n"
!|.q|. $ .,. .,$$&&$...&., @.,.&@$@ ..|,map{-$|--?$r:$p.=$_}split'!';eval$r
__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]