[ https://issues.apache.org/jira/browse/BEAM-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aviem Zur updated BEAM-1294:
----------------------------
    Summary: Long running UnboundedSource Readers  (was: Long running UnboundedSource Readers via Broadcasts)

> Long running UnboundedSource Readers
> ------------------------------------
>
>                 Key: BEAM-1294
>                 URL: https://issues.apache.org/jira/browse/BEAM-1294
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Aviem Zur
>
> When reading from an UnboundedSource, the current implementation causes each
> split to create a new Reader every micro-batch.
> As long as the overhead of creating a reader is relatively low, this is
> reasonable (though I'd still be happy to get rid of it), but in cases where the
> creation overhead is large it becomes unreasonable, forcing large batches.
> One way to solve this could be to create a pool of lazy-init readers to serve
> each executor, maybe via Broadcast variables.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
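
The "pool of lazy-init readers" idea from the description could be sketched roughly as follows. This is only an illustration of the caching pattern, not the Spark runner's actual code: `LazyReaderPool`, `readerFor`, and the string stand-in for a reader are hypothetical names, and the sketch ignores checkpointing, reader shutdown, and how the pool would be handed to each executor (for example via a Broadcast variable).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical per-JVM cache of lazily created readers, keyed by split id. */
final class LazyReaderPool<K, R> {
    private final Map<K, R> readers = new ConcurrentHashMap<>();
    private final Function<K, R> factory;

    LazyReaderPool(Function<K, R> factory) {
        this.factory = factory;
    }

    /** Returns the cached reader for this split, creating it on first use only. */
    R readerFor(K splitId) {
        return readers.computeIfAbsent(splitId, factory);
    }

    /** Number of readers created so far. */
    int size() {
        return readers.size();
    }
}

public class LazyReaderPoolDemo {
    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();

        // The factory stands in for an expensive reader creation (assumption).
        LazyReaderPool<Integer, String> pool = new LazyReaderPool<>(splitId -> {
            created.incrementAndGet();
            return "reader-" + splitId;
        });

        // Three simulated micro-batches over two splits.
        for (int batch = 0; batch < 3; batch++) {
            for (int split = 0; split < 2; split++) {
                pool.readerFor(split);
            }
        }

        // Each split's reader is created once and reused across batches.
        System.out.println("readers created: " + created.get());
    }
}
```

With a pool like this, the creation cost is paid once per split per JVM rather than once per micro-batch, which is the overhead the issue wants to avoid.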