Hi,

Your actor code is faulty: you are closing over the actor's mutable state
(processing, executeQuery, readCount) in Future callbacks, namely
future.foreach and future.onFailure. Those callbacks run on another thread,
so this is not thread safe and will fail in various interesting ways.
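For reference, the safe pattern is to pipe the Future result back to the actor as a message instead of mutating state from the callback. A sketch (message and method names here are invented for illustration, not from your code):

```scala
import akka.actor.Status
import akka.pattern.pipe
import akka.stream.actor.ActorPublisher
import akka.stream.actor.ActorPublisherMessage.Request
import scala.concurrent.Future

final case class QueryResult(hits: SearchHits) // hypothetical wrapper message

class SafeScrollPublisher extends ActorPublisher[SearchHits] {
  import context.dispatcher

  var processing = false // only touched inside receive, never from a callback

  def executeQuery(): Future[SearchHits] = ??? // stand-in for the real query

  def receive = {
    case Request(_) if !processing =>
      processing = true
      // The Future completes on some other thread, but pipeTo turns the
      // result into a message, so all state changes happen in receive.
      executeQuery().map(QueryResult) pipeTo self
    case QueryResult(hits) =>
      processing = false // runs on the actor's own dispatcher: safe
      if (isActive && totalDemand > 0) onNext(hits)
    case Status.Failure(t) => // pipeTo delivers failed Futures like this
      processing = false
      onError(t)
  }
}
```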

There is not much reason to use an ActorPublisher for this, to be honest.
There are built-in combinators that achieve the same thing with less chance
for mistakes:

http://doc.akka.io/api/akka/2.4/index.html#akka.stream.scaladsl.Source$@unfoldResource[T,S](create:()=>S,read:S=>Option[T],close:S=>Unit):akka.stream.scaladsl.Source[T,akka.NotUsed]

http://doc.akka.io/api/akka/2.4/index.html#akka.stream.scaladsl.Source$@unfoldResourceAsync[T,S](create:()=>scala.concurrent.Future[S],read:S=>scala.concurrent.Future[Option[T]],close:S=>scala.concurrent.Future[akka.Done]):akka.stream.scaladsl.Source[T,akka.NotUsed]
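With unfoldResourceAsync your scroll could look roughly like this, reusing the elastic4s-style calls from your snippet. This is an untested sketch: the "resource" here is just an AtomicReference holding the current scroll id, and downstream demand drives how fast read is called, so backpressure comes for free.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.{ExecutionContext, Future}
import akka.{Done, NotUsed}
import akka.stream.scaladsl.Source

// The resource is the current scroll id, empty until the first query ran.
def scrollSource(implicit ec: ExecutionContext): Source[SearchHits, NotUsed] =
  Source.unfoldResourceAsync[SearchHits, AtomicReference[Option[String]]](
    create = () => Future.successful(new AtomicReference[Option[String]](None)),
    read = state => {
      val fut = state.get() match {
        case None => // first page: run the initial search
          clientFrom.execute {
            search in config.indexFrom / config.mapping scroll "30m" limit config.scrollSize
          }
        case Some(id) => // subsequent pages: continue the scroll
          clientFrom.execute { searchScroll(id).keepAlive("30m") }
      }
      fut.map { response =>
        state.set(Some(response.getScrollId))
        // None signals end-of-stream when the scroll is exhausted
        if (response.getHits.hits.nonEmpty) Some(response.getHits) else None
      }
    },
    close = _ => Future.successful(Done) // let the server-side scroll expire
  )
```

The stage only calls read again after the previous Future completed and the element was demanded downstream, so the slow writer naturally throttles the reads.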

-Endre

On Mon, Jul 25, 2016 at 1:11 AM, Roy Russo <royru...@gmail.com> wrote:

> Scenario: Read from one database using an ActorPublisher, write to another
> database using a subscriber.
>
> I expect the reads to be much faster than the writes, so we need to slow
> down the reads at some threshold. Growing an unbounded queue of data will
> simply OOM. The below works for small datasets. With large datasets, the
> gap between reads and writes becomes enormous, and so it OOMs.
>
> My ActorPublisher:
>
> class ScrollPublisher(clientFrom: ElasticClient, config: Config) extends ActorPublisher[SearchHits] {
>
>   val logger = Logger(LoggerFactory.getLogger(this.getClass))
>   var readCount = 0
>   var processing = false
>
>   import akka.stream.actor.ActorPublisherMessage._
>
>   @volatile var executeQuery = () => clientFrom.execute {
>     search in config.indexFrom / config.mapping scroll "30m" limit config.scrollSize
>   }
>
>   def nextHits(): Unit = {
>     if (!processing) {
>       processing = true
>       val future = executeQuery()
>       future.foreach {
>         response =>
>           processing = false
>           if (response.getHits.hits.nonEmpty) {
>             logger.info("Fetched: \t" + response.getHits.getHits.length + " documents in\t" + response.getTookInMillis + "ms.")
>             readCount += response.getHits.getHits.length
>             logger.info("Total Fetched:\t" + readCount)
>             if (isActive && totalDemand > 0) {
>               executeQuery = () => clientFrom.execute {
>                 searchScroll(response.getScrollId).keepAlive("30m")
>               }
>               nextHits()
>               onNext(response.getHits) // sends elements to the stream
>             }
>           } else {
>             onComplete()
>           }
>       }
>       future.onFailure {
>         case t =>
>           processing = false
>           throw t
>       }
>     }
>   }
>
>   def receive = {
>     case Request(cnt) =>
>       logger.info("ActorPublisher Received: \t" + cnt)
>       if (isActive && totalDemand > 0) {
>         nextHits()
>       }
>     case Cancel =>
>       context.stop(self)
>     case _ =>
>   }
> }
>
> Source declaration:
>
> // SearchHits Akka Stream Source
> val documentSource = Source.actorPublisher[SearchHits](Props(new ScrollPublisher(clientFrom, config))).map {
>   case searchHits =>
>     searchHits.getHits
> }
>
>
> My Sink, which performs an async write to the new database:
>
> documentSource.buffer(16, OverflowStrategy.backpressure).runWith(Sink.foreach {
>   searchHits =>
>     Thread.sleep(1000)
>     totalRec += searchHits.size
>     logger.info("\t\t\tRECEIVED: " + searchHits.size + " \t\t\t TOTAL RECEIVED: " + totalRec)
>     val bulkIndexes = searchHits.map(hit => (hit.`type`, hit.id, hit.sourceAsString())).collect {
>       case (typ, _id, source) =>
>         index into config.indexTo / config.mapping id _id -> typ doc JsonDocumentSource(source)
>     }
>     val future = clientTo.execute {
>       bulk(
>         bulkIndexes
>       )
>     }
> })
>
>
>
> The sleep is put in there to simulate lag for local development. I've
> tried changing values for the buffer, and the max/initial values for the
> materializer, and still it seems to ignore back pressure.
>
> Is there a logic flaw in this code?
>
>

-- 
>>>>>>>>>>      Read the docs: http://akka.io/docs/
>>>>>>>>>>      Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>>      Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
