A ScheduledThreadPool doesn't parallelize or partition your work; it
schedules tasks and keeps a pool of Thread objects it can reuse for that
purpose. Each scheduled task runs on a single pool thread, so one Runnable
that loops over 30 transcripts processes them one after another. If you
need a job to be broken into smaller pieces and executed on a schedule,
you'll need to implement some sort of coordination yourself. There's some
prior art: frameworks like Quartz assist with tracking individual tasks
across hosts or restarts of an app, and things like Spark are designed for
coordinating subdivided tasks and combining results.

As far as I know there's no magic tool that knows how to reliably subdivide
your task.
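
That said, when the work is already a list of independent items, as with
the transcripts in the quoted message, the coordination can be fairly
small. Below is a minimal sketch, not a definitive implementation: the
scheduled task only fans the items out to a separate worker pool and waits
for them to finish. The names process-item, process-all,
cycle-aggregate-parallel, fetch-transcripts, and the pool sizes are all
hypothetical; process-item stands in for aggregate-words plus
set-transcript-processed.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future TimeUnit))

;; Hypothetical stand-in for the real per-transcript work. It records the
;; thread name so you can see which pool thread handled each item.
(defn process-item
  [x]
  {:item x :thread (.getName (Thread/currentThread))})

(defn process-all
  "Submits one task per item to `pool` and blocks until all of them finish.
  Returns the results in the same order as `items`. If a task threw, the
  corresponding .get call throws an ExecutionException."
  [^ExecutorService pool items]
  ;; Clojure functions implement java.util.concurrent.Callable, so a vector
  ;; of zero-argument fns can be handed straight to invokeAll.
  (let [tasks (mapv (fn [x] (fn [] (process-item x))) items)]
    (mapv (fn [^Future f] (.get f)) (.invokeAll pool tasks))))

(defn cycle-aggregate-parallel
  "Returns a zero-argument fn suitable for scheduleAtFixedRate. `fetch-items`
  is a hypothetical zero-argument fn returning the items to process."
  [^ExecutorService workers fetch-items]
  (fn []
    (process-all workers (fetch-items))))

(comment
  ;; Assumed wiring: one scheduler thread triggers the cycle, eight worker
  ;; threads do the per-item work. Keep the try/catch in the real task,
  ;; because an exception escaping a scheduleAtFixedRate task suppresses
  ;; all later runs.
  (def workers (Executors/newFixedThreadPool 8))
  (def scheduler (Executors/newScheduledThreadPool 1))
  (.scheduleAtFixedRate scheduler
                        ^Runnable (cycle-aggregate-parallel workers
                                                            fetch-transcripts)
                        1 30 TimeUnit/MINUTES))
```

On the (future) question: a future runs on Clojure's own agent thread
pool, not on the pool created with Executors/newScheduledThreadPool, so it
would parallelize the work but outside the pool you configured. Submitting
to an explicit ExecutorService, as sketched here, keeps the thread count
under your control.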

On Wed, Jan 2, 2019 at 12:47 PM <lawrence.krub...@gmail.com> wrote:

> I guess this is more of a JVM question than a Clojure question, unless
> Clojure exerts any special magic here. I'm open to a more Clojure approach
> than what I have now.
>
> Someone suggested I use Executors/newScheduledThreadPool for some
> recurring work, so I set it up like this:
>
> (def scheduler-aggregate
>   (Executors/newScheduledThreadPool 32))
>
> at the start I call:
>
>   (.scheduleAtFixedRate scheduler-aggregate
>                         ^Runnable (cycle-aggregate to-database-queue)
>                         1 30 TimeUnit/MINUTES)
>
> Aside from a try/catch block (which I just removed to simplify this
> example) the inner function looks like this:
>
> (defn- cycle-aggregate
>   [to-database-queue]
>   (fn []
>     (let [transcripts (query @global-database-connection
>                              {:item-type :transcript
>                               :processed {operators/$exists false}})]
>       (doseq [x transcripts]
>         (aggregate-words x)
>         (set-transcript-processed @global-database-connection x)))))
>
> The function (aggregate-words) counts up a bunch of words, doing some prep
> work for a later NLP engine, and then there is this line:
>
>     (log "The end of aggregate-words.")))
>
> The whole process takes about 5 minutes to run, about 300 seconds. I watch
> the database and I see the number of new records increase. About every 10
> seconds I see these words appear in the logs:
>
> "The end of aggregate-words."
>
> At the end of 5 minutes, these words have appeared 30 times, one for each
> of the transcripts I'm importing.
>
> Have I done something wrong? Since the words "The end of
> aggregate-words." appear at roughly equal intervals, and the transcripts
> are all about the same size, it seems that all of the transcripts are
> being handled on one thread. After all, if the 30 transcripts were handled
> on 30 threads, I'd expect the 30 calls to aggregate-words to all end at
> roughly the same time, instead of sequentially.
>
> What else do I need to do to parallelize this work? If I call (future)
> inside of aggregate-words, would the new thread come from the pool? Is
> there a way I can call aggregate-words and make sure it runs on its own
> thread from the pool?
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
