Wow. That's an impressive reply.

On 08.01.2016 20:26, srinivas devaki wrote:
So you need a task scheduler with expiration and priority on each task.
Interesting Problem..

as peter said about marking the task in heapB to be deleted. but this
needs searching everytime you pop off an old item [complexity:
O(len(heapB))]. you may as well use python del on it as complexity is
same.
But if you already know the where to look in the heapB then you can
avoid searching and thereby reducing the complexity. you can do this
by saving the references of heapB in heapA and viceversa

and if you marking delete on a number of low priority tasks, then it
can increase your heapB enormously because being low priority items
they can stagnate. to resolve this error you have to perform linear
checking iterations at every critical length (this critical length can
be obtained mathmatically), so that your amortized complexity of push,
pop remains log(number_of_valid_tasks_at_any_time);
the number of valid tasks at any time are those tasks which are not
expired at the time.

My Algorithm:
version_a: https://gist.github.com/9ce7a0e534c6e768239e
this version has simple scheduler which has methods for popping high
priority one and popping oldest task.
But this version has a disadvantage, i.e If lets say you are pushed
some 10**5 tasks with priority 2, and then pushed some 10**5 tasks
with priority 1. and then you decided to pop the oldest 10**5 items.
in this version that 10**5 elements will stagnate in priority heap if
in future all priorities are less than 1.
now this is not a big issue but if you are running a webserver and
over a span of 5 days there could be 10**10 tasks, and even if half of
those tasks stagnate your server is going to crash with out of memory.

version_b: https://gist.github.com/99b4d590753ba234eeed
this version resolved that stagnation. this one will run sweeps
whenever there are more than half of useless items, thereby giving us
an amortized complexity of O(log(n)) for push, pop, etc

but this version doesn't have the feature of expiration.

version_c: https://gist.github.com/9dfd0d291672c0ffa5c3
in this one we simply keep a variable expiration, and relieve the
expired tasks on any operation. i have coded it in such a way that
expiration is specific time, you can change it to delta time if you
want to.

Time Complexity: O(log(n)) insertion, O(log(n)) deletion   [amortized]
Space Complexity: O(n)  [amortized]
here n is number of valid items i.e which are not expired.

I hope this helps with your problem

Indeed. I already do the sweep method as you suggested. ;)

Additionally, you provided me with a reasonable condition when to do the sweep in order to achieve O(log n). Thanks much for that. I currently used a time-bases approached (sweep each 20 iterations).

PS: Could you add a note on how you got to the condition ( 2*self.useless_b > len(self.heap_b))?

PS:
sorry for posting links, it's just that the code is large for email.
I'm using minimum number has highest priority convention.

I like Web technology, so no problem here. :)

On Fri, Jan 8, 2016 at 10:15 PM, Sven R. Kunze <srku...@mail.de> wrote:
Thanks for your suggestion.

On 08.01.2016 14:21, srinivas devaki wrote:
You can create a single heap with primary key as timestamp and
secondary key as priority, i.e by creating a tuple
insert the elements into the heap as
(timestamp, priority)
I think I cannot use that because I need the list sorted by both criteria.
If there is any underlying meaning for creating 2 heaps. please mention.

I use two heaps because I need to insert arbitrary items fast and remove the
ones fast which are too old (timestamp) or are next in order (priority).

Basically a task scheduler where tasks can be thrown away once they are too
long in the queue.


On Fri, Jan 8, 2016 at 4:22 AM, Sven R. Kunze <srku...@mail.de> wrote:
Hi everybody,

suppose, I need items sorted by two criteria (say timestamp and
priority).
For that purpose, I use two heaps (heapq module):

heapA # items sorted by timestamp
heapB # items sorted by priority

Now my actual problem. When popping an item of heapA (that's the oldest
item), I need to remove the very same item from heapB, regardlessly where
it
is in heapB. And vice versa.

Is there a datastructure or a simple trick to achieve that in an
efficient
matter?

Best,
Sven
--
https://mail.python.org/mailman/listinfo/python-list


--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to