[GitHub] [samza] shanthoosh merged pull request #953: Set job coordinator replication factor configuration for standalone.
shanthoosh merged pull request #953: Set job coordinator replication factor configuration for standalone. URL: https://github.com/apache/samza/pull/953 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [samza] shanthoosh commented on issue #953: Set job coordinator replication factor configuration for standalone.
shanthoosh commented on issue #953: Set job coordinator replication factor configuration for standalone. URL: https://github.com/apache/samza/pull/953#issuecomment-473483420 @vjagadish1989 Can you take a look.
[GitHub] [samza] shanthoosh opened a new pull request #953: Set job coordinator replication factor configuration for standalone.
shanthoosh opened a new pull request #953: Set job coordinator replication factor configuration for standalone. URL: https://github.com/apache/samza/pull/953
[GitHub] [samza] rmatharu opened a new pull request #952: Improved standby-aware container allocation for active-containers on job redeploys
rmatharu opened a new pull request #952: Improved standby-aware container allocation for active-containers on job redeploys URL: https://github.com/apache/samza/pull/952
Re: Error handling
Hi Tom,

This would depend on what your k8s container orchestration logic looks like. For example, in YARN, 'status' returns 'not running' after 'start' until all the containers requested from the AM are 'running'. We also leverage YARN to restart containers/job automatically on failures (within some bounds). Additionally, we set up a monitoring alert that goes off if the number of running containers stays lower than the number of expected containers for extended periods of time (~ 5 minutes).

Are you saying that you noticed that the LocalApplicationRunner status returns 'running' even if its stream processor / SamzaContainer has stopped processing?

- Prateek

On Fri, Mar 15, 2019 at 7:26 AM Tom Davis wrote:
> I'm using the LocalApplicationRunner and had added a liveness check around the `status` method. The app is running in Kubernetes so, in theory, it could be restarted if exceptions happened during processing. However, it seems that "container failure" is divorced from "app failure" because the app continues to run even after all the task containers have shut down. Is there a better way to check for application health? Is there a way to shut down the application if all containers have failed? Should I simply ensure exceptions never escape operators? Thanks!
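The monitoring alert described in the reply above (fire only when the number of running containers stays below the expected number for an extended period) can be sketched roughly as follows. This is a minimal illustration and not Samza or YARN code: the class name `SustainedShortfallAlert`, its `observe` method, and the way container counts reach it are all assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical helper (not part of Samza or YARN): reports an alert only when
 * the running-container count has stayed below the expected count for a full
 * grace period, so that brief dips during restarts do not fire it.
 */
final class SustainedShortfallAlert {
    private final int expectedContainers;
    private final Duration gracePeriod;
    private Instant shortfallSince; // null while the count is at or above expected

    SustainedShortfallAlert(int expectedContainers, Duration gracePeriod) {
        if (expectedContainers < 0 || gracePeriod.isNegative()) {
            throw new IllegalArgumentException("expected count and grace period must be non-negative");
        }
        this.expectedContainers = expectedContainers;
        this.gracePeriod = gracePeriod;
    }

    /** Records one observation and returns true if the alert should fire now. */
    synchronized boolean observe(int runningContainers, Instant now) {
        if (runningContainers >= expectedContainers) {
            shortfallSince = null; // recovered: reset the timer
            return false;
        }
        if (shortfallSince == null) {
            shortfallSince = now; // the shortfall starts with this observation
        }
        return Duration.between(shortfallSince, now).compareTo(gracePeriod) >= 0;
    }

    public static void main(String[] args) {
        SustainedShortfallAlert alert = new SustainedShortfallAlert(4, Duration.ofMinutes(5));
        Instant t0 = Instant.parse("2019-03-15T14:00:00Z");
        System.out.println(alert.observe(3, t0));                  // false: shortfall just started
        System.out.println(alert.observe(3, t0.plusSeconds(300))); // true: below expected for 5 minutes
    }
}
```

Passing the timestamp in explicitly keeps the class easy to test. A real monitor would call `observe` on a schedule with whatever container count its orchestrator reports.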
Re: [VOTE] Apache Samza 1.1.0 RC2
+1 (Non-binding)

-- thanks rayman

On Wed, Mar 13, 2019 at 7:17 PM Daniel Chen wrote:
> Hi,
>
> I performed the following verifications:
>
> 1. ./bin/check-all.sh succeeded.
> 2. Verified both ./bin/integration-tests.sh yarn-integration-tests and ./bin/integration-tests.sh standalone-integration-tests succeeded.
> 3. Verified that SQL console available in samza-tool.tgz.
>
> +1 (Non-binding)
>
> Thanks,
> Daniel
>
> On Tue, Mar 12, 2019 at 4:11 PM santhosh venkat <santhoshvenkat1...@gmail.com> wrote:
> > Hi,
> >
> > This is a call for a vote on a release of Apache Samza 1.1.0. Thanks to everyone who has contributed to this release.
> >
> > The release candidate can be downloaded from here:
> > http://home.apache.org/~shanthoosh/samza-1.1.0-rc2/
> >
> > The release candidate is signed with pgp key 0xF8B95961A401BF0F, which can be found
> > http://keyserver.ubuntu.com/pks/lookup?op=get&search=0xF8B95961A401BF0F
> >
> > The git tag is release-1.1.0-rc0 and signed with the same pgp key:
> > https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc2
> >
> > Test binaries have been published to Maven's staging repository, and are available here:
> > https://repository.apache.org/content/repositories/orgapachesamza-1060/
> >
> > The vote will be open for 72 hours (ending at 16:30 PM PST Thursday, 03/15/2018).
> >
> > Please download the release candidate, check the hashes/signature, build it and test it, and then please vote:
> >
> > [ ] +1 approve
> > [ ] +0 no opinion
> > [ ] -1 disapprove (and reason why)
> >
> > I ran check-all.sh, integration tests and verified the SQL console in samza-tool tgz.
> >
> > +1 (non-binding) from my side.
> >
> > Thanks,

-- thanks rayman
Error handling
I'm using the LocalApplicationRunner and had added a liveness check around the `status` method. The app is running in Kubernetes so, in theory, it could be restarted if exceptions happened during processing. However, it seems that "container failure" is divorced from "app failure" because the app continues to run even after all the task containers have shut down. Is there a better way to check for application health? Is there a way to shut down the application if all containers have failed? Should I simply ensure exceptions never escape operators? Thanks!
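For readers in a similar setup, one possible shape for the liveness check discussed above is an HTTP endpoint whose answer comes from an application-supplied health check, so that a Kubernetes `httpGet` probe can restart the pod when the check fails. The sketch below uses only the JDK's built-in `com.sun.net.httpserver` package. The class name `LivenessEndpoint`, the `/healthz` path, and the idea of feeding it a flag that the application sets when processing fails are assumptions for illustration; nothing here is a Samza API, and the thread does not say which Samza hook, if any, would set such a flag.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

/** Hypothetical liveness endpoint; not part of Samza. */
final class LivenessEndpoint {

    /** 200 passes a Kubernetes httpGet probe; 503 fails it. */
    static int statusCodeFor(boolean healthy) {
        return healthy ? 200 : 503;
    }

    /**
     * Starts an HTTP server that answers /healthz from the supplied check.
     * Pass port 0 to bind an ephemeral port. The caller stops the server.
     */
    static HttpServer start(int port, BooleanSupplier healthCheck) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/healthz", exchange -> {
            boolean healthy;
            try {
                healthy = healthCheck.getAsBoolean();
            } catch (RuntimeException e) {
                healthy = false; // a check that throws is treated as unhealthy
            }
            byte[] body = (healthy ? "ok" : "unhealthy").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(statusCodeFor(healthy), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

As a usage assumption, the application could keep an `AtomicBoolean processingFailed`, set it wherever it learns that processing has stopped, and call `LivenessEndpoint.start(8080, () -> !processingFailed.get())`. Whether that signal can be combined with the runner's `status` result depends on the behavior asked about in this thread.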