It would help if your problem statement were more concrete. However,
from my vague understanding of the problem, it seems like event
sourcing would be an appropriate way to model your business logic:
http://martinfowler.com/eaaDev/EventSourcing.html
If that is the case, then I think using Akka Persistence and Cluster
Sharding would be a good starting point. Your 'state changes' sound
like a fit for Persistent Actors. The problem of knowing whether
messages have been processed or not sounds like it would be solved
with persistent actor recovery
(http://doc.akka.io/docs/akka/2.4.4/scala/persistence.html#Recovery).
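
To make the recovery idea concrete, here is a minimal sketch in plain
Python rather than Akka Persistence itself. The names `StateChanged`,
`Journal`, and `ItemStates` are hypothetical, and the in-memory list
only stands in for a durable journal. The point it tries to show is
that every state change is recorded as an event before it is applied,
so after a crash the current state can be rebuilt by replaying the
events instead of inspecting a mailbox.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StateChanged:
    """Event: item `item_id` moved to `new_state` (hypothetical event type)."""
    item_id: str
    new_state: str


@dataclass
class Journal:
    """In-memory stand-in for a durable event log (an assumption of this sketch)."""
    events: list = field(default_factory=list)

    def append(self, event):
        self.events.append(event)


class ItemStates:
    """Current state per item, derived only from journaled events."""

    def __init__(self, journal):
        self.journal = journal
        self.states = {}

    def change_state(self, item_id, new_state):
        # Record the event first, then apply it to the in-memory state.
        event = StateChanged(item_id, new_state)
        self.journal.append(event)
        self._apply(event)

    def _apply(self, event):
        self.states[event.item_id] = event.new_state

    @classmethod
    def recover(cls, journal):
        # Rebuild state after a restart by replaying every stored event.
        # Replay only applies events; it does not append new ones.
        recovered = cls(journal)
        for event in journal.events:
            recovered._apply(event)
        return recovered


journal = Journal()
live = ItemStates(journal)
live.change_state("order-1", "A")
live.change_state("order-2", "A")
live.change_state("order-1", "B")

# Simulate a crash: discard `live` and keep only the journal.
restored = ItemStates.recover(journal)
still_in_a = sorted(k for k, s in restored.states.items() if s == "A")
print(still_in_a)
```

After the replay, only the items still in state A need to be handed
out for processing again.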
-Michael
On 05/09/16 09:56, kraythe wrote:
That's the thing, if there were humans with inboxes I could have
staff call them on the phone and check. :) Reprocessing the messages
is a pretty simple solution IF the messages were small in number. When
you get to the point where there are literally millions of events, the
problem gets a bit more difficult to manage. Say there are 10 million
messages to process and they could take 10 minutes to process. If I
check again 1 minute later, 8 million of the records still show
unprocessed, and I add those 8 million back to the queue, now I have
16 million messages to process. In the next phase, 6 million are added
to the queue and 2 million are processed, so there are now 20 million
messages, and so on. By the time I am done with the original set, I'll
have another 30 million messages to process, all of which are a waste
of computing power because they do nothing. Clearly I would like to
avoid that. Also, setting the check interval to be surely longer than
the time needed to process the first 10 million is not an option,
because the time and the number of messages are both unknown
variables.
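
The growth described above can be checked with a small simulation. This
is only a sketch under assumptions that are not stated in the thread: a
FIFO queue, a fixed rate of 2 units per minute (chosen so that 8 of 10
are still unprocessed after the first minute), and one unit standing
for one million messages. The function name `simulate_naive_resubmit`
is made up for illustration.

```python
from collections import deque


def simulate_naive_resubmit(total, rate):
    """Return (minutes, peak_queue_length, wasted) for blind periodic resubmission."""
    processed = set()
    queue = deque(range(total))
    peak = len(queue)
    wasted = 0
    minutes = 0
    while queue:
        # One interval of work: take up to `rate` messages off the front.
        for _ in range(min(rate, len(queue))):
            item = queue.popleft()
            if item in processed:
                wasted += 1  # duplicate: the record was already handled
            else:
                processed.add(item)
        minutes += 1
        # Periodic check: blindly re-enqueue every record still unprocessed.
        queue.extend(i for i in range(total) if i not in processed)
        peak = max(peak, len(queue))
    return minutes, peak, wasted


# Each unit stands for one million messages; the rate of 2 per minute is assumed.
minutes, peak, wasted = simulate_naive_resubmit(total=10, rate=2)
print(minutes, peak, wasted)  # prints "15 22 20"
```

Under these assumptions the queue goes 16, 20, 22 after the first three
checks, and the 10 units of real work drag 20 units of duplicates
behind them. The exact totals depend on the processing rate, which is
unknown in practice.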
Right now I put the messages that need to be processed in a map with a
key and the process that runs every minute checks for messages not
processed. Then it compares those ids against those in the map; if
they are in the map, it doesn't resubmit them. However, this doesn't
seem to be a very Akkaesque solution to the problem. I am looking for
ideas on how to handle it without using the map but it may be that I
have to continue using the map to load the message queues.
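
One possible way to keep this bookkeeping while hiding the map is to
give the queue and the set of known ids a single owner, which is
roughly the role an actor holding them as private state would play.
The sketch below is plain single-threaded Python, not Akka, and
`DedupingQueue` with its methods is a hypothetical name invented for
illustration.

```python
from collections import deque


class DedupingQueue:
    """FIFO work queue that refuses ids already waiting or being processed."""

    def __init__(self):
        self._queue = deque()
        self._in_flight = set()  # ids enqueued or currently being processed

    def submit(self, item_id):
        """Enqueue item_id unless it is already in flight; return True if enqueued."""
        if item_id in self._in_flight:
            return False
        self._in_flight.add(item_id)
        self._queue.append(item_id)
        return True

    def take(self):
        """Hand out the next id; it stays in flight until done() is called."""
        return self._queue.popleft() if self._queue else None

    def done(self, item_id):
        """Mark processing finished so a later state change can be submitted again."""
        self._in_flight.discard(item_id)

    def __len__(self):
        return len(self._queue)


work = DedupingQueue()
first_pass = [work.submit(i) for i in ("a", "b", "c")]
# The periodic check runs again before anything has finished.
second_pass = [work.submit(i) for i in ("a", "b", "c", "d")]
current = work.take()           # "a" is now being processed
work.done(current)
resubmitted = work.submit("a")  # allowed again once it has finished
print(first_pass, second_pass, resubmitted)
```

The periodic check can then submit every unprocessed id it finds and
let the queue drop the ones it already knows about.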
On Monday, May 9, 2016 at 2:33:54 AM UTC-5, √ wrote:
I'm quite sure that inspecting the mbox will be costlier than
reprocessing at those sizes.
Come up with two different solutions that you could perform
between humans having mailboxes. Pick the best of those.
--
Cheers,
√
On May 8, 2016 5:15 PM, "kraythe" <kra...@gmail.com> wrote:
I have a process that has to manage a large amount of data. I
want to make the process reactive but it has to process every
data element. The data is stored in Hazelcast in a map (which
is backed by a database but that detail is irrelevant) and the
data is stateful. At each state change something has to be
done either to the data or related data. So if we go from
State A to State B we might have to do something to another
object in the process in a transactional manner. When the data
is in state A a process finds the data and submits it to a
map. Right now I have another thread reading from the map on
intervals that are timed and if there is data in the map it
processes the next entry in the map and so on.
I would like to turn this process into an Akka actor process
but the main stumbling block is to know what is already in the
queue. Say I have 1m objects to process. At each interval the
objects are checked if they can change state and if they can
then they are put in the map to process. The problem is there
could be a ton of these objects and they might take longer to
process than the check interval. Furthermore, although it
would not be damaging to the data it would be immensely
wasteful to put them into the queue to process twice. Finally
if the server crashed or something happened I would want to
put them back into the queue if they are still in state A and
should move to state B. Right now I can get the key set of the
map and not submit them to the process if they are already in
the map. If, instead, I change the system to Akka, then that
ability changes. Whenever an object needs to change state, I
would put it in a message inbox to an actor to process but I
have no way to know what is already in that inbox so it makes
the processing of the messages less durable. If a transaction
fails or a node fails, I won't know that certain objects need to
be processed again. Right now I can search the store, get the
objects to process, remove all the ones in the queue and then
put only the missing ones in the queue. I don't know how I
could architect this with Akka.
What I would be looking for is some means to inspect an inbox
to know if a message has already been enqueued and should not
be enqueued again.
Any suggestions on how I could architect a solution to this
problem?
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ:
http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives:
https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the
Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from
it, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.