[GitHub] [samza] shanthoosh merged pull request #953: Set job coordinator replication factor configuration for standalone.

2019-03-15 Thread GitBox
shanthoosh merged pull request #953: Set job coordinator replication factor 
configuration for standalone.
URL: https://github.com/apache/samza/pull/953
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [samza] shanthoosh commented on issue #953: Set job coordinator replication factor configuration for standalone.

2019-03-15 Thread GitBox
shanthoosh commented on issue #953: Set job coordinator replication factor 
configuration for standalone.
URL: https://github.com/apache/samza/pull/953#issuecomment-473483420
 
 
   @vjagadish1989
   Can you take a look?




[GitHub] [samza] shanthoosh opened a new pull request #953: Set job coordinator replication factor configuration for standalone.

2019-03-15 Thread GitBox
shanthoosh opened a new pull request #953: Set job coordinator replication 
factor configuration for standalone.
URL: https://github.com/apache/samza/pull/953
 
 
   




[GitHub] [samza] rmatharu opened a new pull request #952: Improved standby-aware container allocation for active-containers on job redeploys

2019-03-15 Thread GitBox
rmatharu opened a new pull request #952: Improved standby-aware container 
allocation for active-containers on job redeploys
URL: https://github.com/apache/samza/pull/952
 
 
   




Re: Error handling

2019-03-15 Thread Prateek Maheshwari
Hi Tom,

This would depend on what your k8s container orchestration logic looks
like. For example, in YARN, 'status' returns 'not running' after 'start'
until all the containers requested from the AM are 'running'. We also
leverage YARN to restart containers/job automatically on failures (within
some bounds). Additionally, we set up a monitoring alert that goes off if
the number of running containers stays lower than the number of expected
containers for extended periods of time (~ 5 minutes).

Are you saying that you noticed that the LocalApplicationRunner status
returns 'running' even if its stream processor / SamzaContainer has stopped
processing?

- Prateek

On Fri, Mar 15, 2019 at 7:26 AM Tom Davis wrote:

> I'm using the LocalApplicationRunner and had added a liveness check
> around the `status` method. The app is running in Kubernetes so, in
> theory, it could be restarted if exceptions happened during processing.
> However, it seems that "container failure" is divorced from "app
> failure" because the app continues to run even after all the task
> containers have shut down. Is there a better way to check for
> application health? Is there a way to shut down the application if all
> containers have failed? Should I simply ensure exceptions never escape
> operators? Thanks!
>
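
The monitoring rule described in the reply above (alert when the number of running containers stays below the expected number for roughly 5 minutes) could be sketched as follows. This is only an illustration under stated assumptions, not Samza or YARN code: the class `ContainerCountAlert`, its `observe` method, and the way counts and timestamps are supplied are all hypothetical, and the 5-minute threshold is the approximate figure from the message.

```java
import java.time.Duration;

/**
 * Hypothetical sketch (not part of Samza): reports an alert once the number
 * of running containers has been below the expected number for at least
 * the configured threshold.
 */
public class ContainerCountAlert {
    private final int expected;
    private final long thresholdMillis;
    // Time (ms) at which the count first dropped below expected; -1 while healthy.
    private long deficitSinceMillis = -1;

    public ContainerCountAlert(int expected, Duration threshold) {
        this.expected = expected;
        this.thresholdMillis = threshold.toMillis();
    }

    /**
     * Records one observation and returns true if the alert should fire.
     * Both arguments are assumed to come from whatever system polls the cluster.
     */
    public boolean observe(int running, long nowMillis) {
        if (running >= expected) {
            deficitSinceMillis = -1; // recovered, so reset the timer
            return false;
        }
        if (deficitSinceMillis < 0) {
            deficitSinceMillis = nowMillis; // the deficit starts with this observation
        }
        return nowMillis - deficitSinceMillis >= thresholdMillis;
    }

    public static void main(String[] args) {
        ContainerCountAlert alert = new ContainerCountAlert(4, Duration.ofMinutes(5));
        System.out.println(alert.observe(4, 0L));          // all containers running
        System.out.println(alert.observe(3, 60_000L));     // deficit begins at minute 1
        System.out.println(alert.observe(3, 360_000L));    // deficit has lasted 5 minutes
    }
}
```

Resetting the timer on recovery means that brief dips, such as a container being restarted automatically, do not trigger the alert; only a sustained shortfall does.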


Re: [VOTE] Apache Samza 1.1.0 RC2

2019-03-15 Thread rayman preet
+1 (Non-binding)

--
thanks
rayman

On Wed, Mar 13, 2019 at 7:17 PM Daniel Chen wrote:

> Hi,
>
> I performed the following verifications:
>
> 1. ./bin/check-all.sh succeeded.
>
> 2. Verified both ./bin/integration-tests.sh yarn-integration-tests and
> ./bin/integration-tests.sh standalone-integration-tests succeeded.
>
> 3. Verified that the SQL console is available in samza-tool.tgz.
>
> +1 (Non-binding)
>
>
> Thanks,
>
> Daniel
>
>
> On Tue, Mar 12, 2019 at 4:11 PM santhosh venkat <
> santhoshvenkat1...@gmail.com> wrote:
>
> > Hi,
> >
> > This is a call for a vote on a release of Apache Samza 1.1.0. Thanks to
> > everyone who has contributed to this release.
> >
> > The release candidate can be downloaded from here:
> > http://home.apache.org/~shanthoosh/samza-1.1.0-rc2/
> >
> > The release candidate is signed with pgp key 0xF8B95961A401BF0F, which can
> > be found at:
> > http://keyserver.ubuntu.com/pks/lookup?op=get&search=0xF8B95961A401BF0F
> >
> > The git tag is release-1.1.0-rc2 and signed with the same pgp key:
> >
> > https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc2
> >
> > Test binaries have been published to Maven's staging repository, and are
> > available here:
> > https://repository.apache.org/content/repositories/orgapachesamza-1060/
> >
> > The vote will be open for 72 hours (ending at 4:30 PM PST Friday,
> > 03/15/2019).
> >
> > Please download the release candidate, check the hashes/signature, build it
> > and test it, and then please vote:
> >
> > [ ] +1 approve
> >
> > [ ] +0 no opinion
> >
> > [ ] -1 disapprove (and reason why)
> >
> > I ran check-all.sh, integration tests and verified the SQL console
> > in samza-tool tgz.
> >
> > +1 (non-binding) from my side.
> >
> > Thanks,
> >
>


-- 
thanks
rayman


Error handling

2019-03-15 Thread Tom Davis

I'm using the LocalApplicationRunner and had added a liveness check
around the `status` method. The app is running in Kubernetes so, in
theory, it could be restarted if exceptions happened during processing.
However, it seems that "container failure" is divorced from "app
failure" because the app continues to run even after all the task
containers have shut down. Is there a better way to check for
application health? Is there a way to shut down the application if all
containers have failed? Should I simply ensure exceptions never escape
operators? Thanks!
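
A liveness check wrapped around a `status` call, as described in this message, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `LivenessCheck` class, its `State` enum, and the `Supplier` that stands in for the runner's `status` method are hypothetical and are not the Samza API. The check can only be as accurate as the status it reads, so if that status keeps reporting a running application after its containers have stopped, this check will report healthy as well.

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch (not the Samza API): a liveness check that reports
 * healthy only while the wrapped status source says the app is starting
 * or running.
 */
public class LivenessCheck {
    /** Stand-in for the states a runner's status call might report. */
    public enum State { NEW, RUNNING, SUCCESSFUL_FINISH, UNSUCCESSFUL_FINISH }

    private final Supplier<State> statusSource;

    public LivenessCheck(Supplier<State> statusSource) {
        this.statusSource = statusSource;
    }

    /**
     * NEW is treated as alive so that an orchestrator does not restart the
     * process while it is still starting up; any finished state is not alive.
     */
    public boolean isAlive() {
        State state = statusSource.get();
        return state == State.NEW || state == State.RUNNING;
    }

    public static void main(String[] args) {
        // In a real application the supplier would delegate to the runner's status call.
        LivenessCheck check = new LivenessCheck(() -> State.UNSUCCESSFUL_FINISH);
        // A non-zero exit code lets a probe that runs this command detect the failure.
        System.exit(check.isAlive() ? 0 : 1);
    }
}
```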