[jira] [Updated] (SAMZA-880) Moving to github/pull-request for code review and check-in
[ https://issues.apache.org/jira/browse/SAMZA-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navina Ramesh updated SAMZA-880: Fix Version/s: 0.11.0 > Moving to github/pull-request for code review and check-in > -- > > Key: SAMZA-880 > URL: https://issues.apache.org/jira/browse/SAMZA-880 > Project: Samza > Issue Type: Task >Reporter: Yi Pan (Data Infrastructure) >Assignee: Jagadish > Fix For: 0.11.0 > > Attachments: MovingSamzatoPullRequests.pdf, SAMZA-880-1.patch, > SAMZA-880.0.patch > > > As per mailing list > [discussion|http://mail-archives.apache.org/mod_mbox/samza-dev/201602.mbox/%3ccafvexu230gjja7z99lzn6bhnekypjrnof3kyc+dvgko_znq...@mail.gmail.com%3E], > we are going to change the code review and commit process to use github pull > request. This would require a few things: > 1) incorporate kafka-merge-pr.py from [kafka > repo|https://github.com/apache/kafka/blob/trunk/kafka-merge-pr.py] to Samza > that include the merge steps for github pull requests to Apache Samza git repo > 2) update the documentation in contributor corner for the new developer > 3) Document the commiter instructions on using kafka-merge-pr.py. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-1030) Add documentation for change in the contribution process
[ https://issues.apache.org/jira/browse/SAMZA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534658#comment-15534658 ] ASF GitHub Bot commented on SAMZA-1030: --- Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/16 > Add documentation for change in the contribution process > > > Key: SAMZA-1030 > URL: https://issues.apache.org/jira/browse/SAMZA-1030 > Project: Samza > Issue Type: Sub-task >Reporter: Navina Ramesh >Assignee: Navina Ramesh > Fix For: 0.11 > > > We have moved away from RB for contributing patches to Apache Samza and want > to use Github Pull Request. Need documentation in the website on the new > process workflow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
samza git commit: SAMZA-1030; Add documentation for change in the contribution process
Repository: samza Updated Branches: refs/heads/master e6956a37b -> 36bcbacc2 SAMZA-1030; Add documentation for change in the contribution process I re-organized some of the website files and created a "Contributor's Corner" that collates all info related to new contributors and committers Also, updated the merge script with constants to match from the website documentation. Author: Navina RameshReviewers: Jagadish Closes #16 from navina/adding-docs Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/36bcbacc Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/36bcbacc Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/36bcbacc Branch: refs/heads/master Commit: 36bcbacc27faf07621fad44e8edea58eafb94fae Parents: e6956a3 Author: Navina Ramesh Authored: Thu Sep 29 18:09:04 2016 -0700 Committer: vjagadish1989 Committed: Thu Sep 29 18:09:04 2016 -0700 -- bin/merge-pull-request.py | 4 +- docs/_layouts/default.html | 4 +- docs/contribute/code.md| 6 +- docs/contribute/coding-guide.md| 7 ++ docs/contribute/contributors-corner.md | 116 docs/contribute/projects.md| 34 docs/contribute/rules.md | 40 -- 7 files changed, 128 insertions(+), 83 deletions(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/bin/merge-pull-request.py -- diff --git a/bin/merge-pull-request.py b/bin/merge-pull-request.py index 46c67e4..5f1dae7 100755 --- a/bin/merge-pull-request.py +++ b/bin/merge-pull-request.py @@ -51,9 +51,9 @@ PROJECT_NAME = "samza" CAPITALIZED_PROJECT_NAME = "samza".upper() # Remote name which points to the GitHub site -PR_REMOTE_NAME = os.environ.get("PR_REMOTE_NAME", "apache-github") +PR_REMOTE_NAME = os.environ.get("PR_REMOTE_NAME", "samza-github") # Remote name which points to Apache git -PUSH_REMOTE_NAME = os.environ.get("PUSH_REMOTE_NAME", "apache") +PUSH_REMOTE_NAME = os.environ.get("PUSH_REMOTE_NAME", "samza-apache") # ASF JIRA username JIRA_USERNAME = os.environ.get("JIRA_USERNAME", "") # ASF JIRA password http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/_layouts/default.html -- diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html index d20e603..60e56b5 100644 --- a/docs/_layouts/default.html +++ b/docs/_layouts/default.html @@ -92,12 +92,10 @@ Contribute - Rules + Contributor's Corner Coding Guide - Projects Design Documents Code - https://reviews.apache.org/groups/samza;>Review Board Tests http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/contribute/code.md -- diff --git a/docs/contribute/code.md b/docs/contribute/code.md index be85c8d..2ebe354 100644 --- a/docs/contribute/code.md +++ b/docs/contribute/code.md @@ -27,8 +27,6 @@ You can check out Samza's code by running: git clone http://git-wip-us.apache.org/repos/asf/samza.git ``` -Please see the [Rules](rules.html) page for information on how to contribute. +Official releases can be downloaded from [here](/startup/download) -If you are a committer you need to use https instead of http to check in, otherwise you will get an error regarding an inability to acquire a lock. Note that older versions of git may also give this error even when the repo was cloned with https; if you experience this try a newer version of git. - -The Samza website is built by Jekyll from the markdown files found in the docs subdirectory. For committers wishing to update the webpage, please see `docs/README.md` for instructions. +Please see [Contributor's Corner'](rules.html) for information on how to contribute. \ No newline at end of file http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/contribute/coding-guide.md -- diff --git a/docs/contribute/coding-guide.md b/docs/contribute/coding-guide.md index 659834f..260c488 100644 --- a/docs/contribute/coding-guide.md +++ b/docs/contribute/coding-guide.md @@ -22,6 +22,13 @@ title: Coding Guide These guidelines are meant to encourage consistency and best practices amongst people working on the Samza code base. They should be observed unless there is a compelling reason to ignore them. +In general, patches should include: + +* Code change +* Unit tests +* Javadocs +*
[jira] [Commented] (SAMZA-1030) Add documentation for change in the contribution process
[ https://issues.apache.org/jira/browse/SAMZA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534574#comment-15534574 ] ASF GitHub Bot commented on SAMZA-1030: --- GitHub user navina opened a pull request: https://github.com/apache/samza/pull/16 SAMZA-1030 : Add documentation for change in the contribution process I re-organized some of the website files and created a "Contributor's Corner" that collates all info related to new contributors and committers Also, updated the merge script with constants to match from the website documentation. You can merge this pull request into a Git repository by running: $ git pull https://github.com/navina/samza adding-docs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/samza/pull/16.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #16 commit 3ac500d766475af5eebbf6a8c8b258c94e2666fa Author: Navina RameshDate: 2016-09-30T00:28:04Z Adding doc for contributors page and fixing code in merge script > Add documentation for change in the contribution process > > > Key: SAMZA-1030 > URL: https://issues.apache.org/jira/browse/SAMZA-1030 > Project: Samza > Issue Type: Sub-task >Reporter: Navina Ramesh >Assignee: Navina Ramesh > Fix For: 0.11.0 > > > We have moved away from RB for contributing patches to Apache Samza and want > to use Github Pull Request. Need documentation in the website on the new > process workflow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SAMZA-1030) Add documentation for change in the contribution process
Navina Ramesh created SAMZA-1030: Summary: Add documentation for change in the contribution process Key: SAMZA-1030 URL: https://issues.apache.org/jira/browse/SAMZA-1030 Project: Samza Issue Type: Sub-task Reporter: Navina Ramesh Assignee: Navina Ramesh Fix For: 0.11.0 We have moved away from RB for contributing patches to Apache Samza and want to use Github Pull Request. Need documentation in the website on the new process workflow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[samza] Git Push Summary
Repository: samza Updated Tags: refs/tags/release-0.11.0-rc0 [created] 7271e02af
[jira] [Commented] (SAMZA-1029) Prepare release candidate for 0.11.0
[ https://issues.apache.org/jira/browse/SAMZA-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534353#comment-15534353 ] Xinyu Liu commented on SAMZA-1029: -- The list of work items: - prepare binary artifacts and make the source tar ball available for testing (before vote) - officially upload artifacts to Apache repo (after vote passed) - update and release documents, following docs/README.md, specifically "Release-new-version Website Checklist" section. After this is done, we can officially update Samza BLOG with the new release. > Prepare release candidate for 0.11.0 > > > Key: SAMZA-1029 > URL: https://issues.apache.org/jira/browse/SAMZA-1029 > Project: Samza > Issue Type: Task >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11, 0.11.0 > > > Upgrade to Samza 0.11. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[samza] Git Push Summary
Repository: samza Updated Branches: refs/heads/0.11.0 [created] e6956a37b
[jira] [Created] (SAMZA-1029) Prepare release candidate for 0.11.0
Xinyu Liu created SAMZA-1029: Summary: Prepare release candidate for 0.11.0 Key: SAMZA-1029 URL: https://issues.apache.org/jira/browse/SAMZA-1029 Project: Samza Issue Type: Task Reporter: Xinyu Liu Assignee: Xinyu Liu Fix For: 0.11, 0.11.0 Upgrade to Samza 0.11. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
samza git commit: Update version for 0.11.0 release
Repository: samza Updated Branches: refs/heads/master 61d35f26c -> e6956a37b Update version for 0.11.0 release Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/e6956a37 Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/e6956a37 Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/e6956a37 Branch: refs/heads/master Commit: e6956a37ba358b45b12df2ceeb6add5834fc8dd6 Parents: 61d35f2 Author: Xinyu LiuAuthored: Thu Sep 29 15:45:17 2016 -0700 Committer: Xinyu Liu Committed: Thu Sep 29 15:45:17 2016 -0700 -- gradle.properties | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/e6956a37/gradle.properties -- diff --git a/gradle.properties b/gradle.properties index 16e1f5d..c94f619 100644 --- a/gradle.properties +++ b/gradle.properties @@ -15,7 +15,7 @@ # specific language governing permissions and limitations # under the License. group=org.apache.samza -version=0.10.1-SNAPSHOT +version=0.11.0 scalaVersion=2.10 gradleVersion=2.0
[jira] [Commented] (SAMZA-681) Create a unit test harness to easily test samza tasks
[ https://issues.apache.org/jira/browse/SAMZA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534245#comment-15534245 ] Yi Pan (Data Infrastructure) commented on SAMZA-681: [~joseangel], assigned to you. Thanks! > Create a unit test harness to easily test samza tasks > - > > Key: SAMZA-681 > URL: https://issues.apache.org/jira/browse/SAMZA-681 > Project: Samza > Issue Type: Test > Components: test >Affects Versions: 0.9.0 >Reporter: Luis De Pombo >Assignee: Jose Martinez > Attachments: SAMZA-381.patch > > > We don't have a standardized test framework for unit testing samza tasks. > Currently, in order to test the proper functionality of a task we have to > bring up the system (kafka) and its dependencies (zk). > Goals: > - The harness should abstract several notions around the system and its > partitions, without mocking them, and hold task context and state in memory. > - The harness should let you inject severalpairs that would > normally be passed in by the `SystemStream` provided in `tasks.input`. > - The same harness should offer an interface to visualize all the > corresponding pairs collected in the message `Collector`, which > are the output of the `process`ed or `window`ed input pairs > previously injected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SAMZA-681) Create a unit test harness to easily test samza tasks
[ https://issues.apache.org/jira/browse/SAMZA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Pan (Data Infrastructure) updated SAMZA-681: --- Assignee: Jose Martinez > Create a unit test harness to easily test samza tasks > - > > Key: SAMZA-681 > URL: https://issues.apache.org/jira/browse/SAMZA-681 > Project: Samza > Issue Type: Test > Components: test >Affects Versions: 0.9.0 >Reporter: Luis De Pombo >Assignee: Jose Martinez > Attachments: SAMZA-381.patch > > > We don't have a standardized test framework for unit testing samza tasks. > Currently, in order to test the proper functionality of a task we have to > bring up the system (kafka) and its dependencies (zk). > Goals: > - The harness should abstract several notions around the system and its > partitions, without mocking them, and hold task context and state in memory. > - The harness should let you inject severalpairs that would > normally be passed in by the `SystemStream` provided in `tasks.input`. > - The same harness should offer an interface to visualize all the > corresponding pairs collected in the message `Collector`, which > are the output of the `process`ed or `window`ed input pairs > previously injected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference
[ https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadish closed SAMZA-1028. --- > Moving logline in callback thread and making exception thrown AtomicReference > - > > Key: SAMZA-1028 > URL: https://issues.apache.org/jira/browse/SAMZA-1028 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11 > > > Current the error log happens after produce close and reset the exception in > later callbacks, which caused the trouble shooting to be harder in cases of > multithreading. We should log error before closing and keep atomic reference > of the initial exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference
[ https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534186#comment-15534186 ] Jagadish commented on SAMZA-1028: - Thanks for the patch, Merged and submitted! > Moving logline in callback thread and making exception thrown AtomicReference > - > > Key: SAMZA-1028 > URL: https://issues.apache.org/jira/browse/SAMZA-1028 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11 > > > Current the error log happens after produce close and reset the exception in > later callbacks, which caused the trouble shooting to be harder in cases of > multithreading. We should log error before closing and keep atomic reference > of the initial exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference
[ https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadish resolved SAMZA-1028. - Resolution: Fixed > Moving logline in callback thread and making exception thrown AtomicReference > - > > Key: SAMZA-1028 > URL: https://issues.apache.org/jira/browse/SAMZA-1028 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11 > > > Current the error log happens after produce close and reset the exception in > later callbacks, which caused the trouble shooting to be harder in cases of > multithreading. We should log error before closing and keep atomic reference > of the initial exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
samza git commit: SAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for tracking producer exceptions
Repository: samza Updated Branches: refs/heads/master 317b6ff1b -> 61d35f26c SAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for tracking producer exceptions Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/61d35f26 Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/61d35f26 Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/61d35f26 Branch: refs/heads/master Commit: 61d35f26c1f924f376b30f60d05c1957364c05e5 Parents: 317b6ff Author: Xinyu LiuAuthored: Thu Sep 29 14:51:23 2016 -0700 Committer: vjagadish1989 Committed: Thu Sep 29 14:52:27 2016 -0700 -- .../system/kafka/KafkaSystemProducer.scala | 69 +--- 1 file changed, 47 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/61d35f26/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala -- diff --git a/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala b/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala index 5ff6d3c..aac53fc 100644 --- a/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala +++ b/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala @@ -20,8 +20,8 @@ package org.apache.samza.system.kafka -import java.util.concurrent.ConcurrentHashMap -import java.util.concurrent.Future +import java.util.concurrent.atomic.AtomicReference +import java.util.concurrent.{TimeUnit, ConcurrentHashMap, Future} import org.apache.kafka.clients.producer.Callback import org.apache.kafka.clients.producer.Producer @@ -56,12 +56,13 @@ class KafkaSystemProducer(systemName: String, @volatile var latestFuture: Future[RecordMetadata] = null /** - * exceptionThrown: to store the exception in case of any "ultimate" send failure (ie. failure + * exceptionInCallback: to store the exception in case of any "ultimate" send failure (ie. failure * after exhausting max_retries in Kafka producer) in the I/O thread, we do not continue to queue up more send * requests from the samza thread. It helps the samza thread identify if the failure happened in I/O thread or not. + * + * In cases of multiple exceptions in the callbacks, we keep the first one before throwing. */ -@volatile -var exceptionThrown: SamzaException = null +var exceptionInCallback: AtomicReference[SamzaException] = new AtomicReference[SamzaException]() } @volatile var producer: Producer[Array[Byte], Array[Byte]] = null @@ -80,13 +81,13 @@ class KafkaSystemProducer(systemName: String, producer = null sources.foreach {p => -if (p._2.exceptionThrown == null) { +if (p._2.exceptionInCallback.get() == null) { flush(p._1) } } } } catch { -case e: Exception => logger.error(e.getMessage, e) +case e: Exception => error(e.getMessage, e) } } } @@ -97,6 +98,21 @@ class KafkaSystemProducer(systemName: String, } } + def closeAndNullifyCurrentProducer(currentProducer: Producer[Array[Byte], Array[Byte]]) { +try { + // TODO: we should use timeout close() to make sure we fail all waiting messages in kafka 0.9+ + currentProducer.close() +} catch { + case e: Exception => error("producer close failed", e) +} +producerLock.synchronized { + if (currentProducer == producer) { +// only nullify the member producer if it is still the same object, no point nullifying new producer +producer = null + } +} + } + def send(source: String, envelope: OutgoingMessageEnvelope) { trace("Enqueuing message: %s, %s." format (source, envelope)) @@ -110,10 +126,10 @@ class KafkaSystemProducer(systemName: String, throw new IllegalArgumentException("Source %s must be registered first before send." format source) } -val exception = sourceData.exceptionThrown +val exception = sourceData.exceptionInCallback.getAndSet(null) if (exception != null) { metrics.sendFailed.inc - throw exception + throw exception // in case the caller catches all exceptions and will try again } // lazy initialization of the producer @@ -134,8 +150,7 @@ class KafkaSystemProducer(systemName: String, // Java-based Kafka producer API requires an "Integer" type partitionKey and does not allow custom overriding of Partitioners // Any kind of custom partitioning has to be done on the client-side val partitions: java.util.List[PartitionInfo] =
[jira] [Commented] (SAMZA-1027) InstanceAlreadyExistsException while starting up
[ https://issues.apache.org/jira/browse/SAMZA-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534154#comment-15534154 ] Xinyu Liu commented on SAMZA-1027: -- [~ncolomer]: could you please check whether you have producers for multiple kafka systems? In SAMZA-981, we changed the way to generate clientId for kafka consumers/producers to be based on just job name and job instance in order to support kafka quotas (http://kafka.apache.org/documentation.html#design_quotas). But it revealed a bug in kafka that client id is also used to register the metrics mbeans. So if you have kafka producers for multiple systems, we will create multiple producers sharing the same clientId. This caused the mbeans naming collision. According to kafka, the only side effect is that some of the kafka metrics won't be reported through mbeans. Does that affect you? We are working with Kafka to get a fix for it in upcoming kafka version. > InstanceAlreadyExistsException while starting up > > > Key: SAMZA-1027 > URL: https://issues.apache.org/jira/browse/SAMZA-1027 > Project: Samza > Issue Type: Bug > Components: kafka >Affects Versions: 0.11 >Reporter: Nicolas Colomer > > After upgrading Samza, I started to see following WARN log while starting a > Samza job: > {code} > 2016-09-29 10:30:09 AppInfoParser [WARN] task[] ssp[] offset[] Error > registering AppInfo mbean > javax.management.InstanceAlreadyExistsException: > kafka.producer:type=app-info,id=samza_producer-my_awesome_job-123 > at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) > at > org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:58) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:328) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:163) > at > org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89) > at > org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89) > at > org.apache.samza.system.kafka.KafkaSystemProducer.send(KafkaSystemProducer.scala:124) > at > org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:113) > at > org.apache.samza.coordinator.stream.AbstractCoordinatorStreamManager.send(AbstractCoordinatorStreamManager.java:72) > at > org.apache.samza.container.LocalityManager.writeContainerToHostMapping(LocalityManager.java:134) > at > org.apache.samza.container.SamzaContainer.startLocalityManager(SamzaContainer.scala:739) > at > org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:651) > at > org.apache.samza.container.SamzaContainer$.safeMain(SamzaContainer.scala:116) > at > org.apache.samza.container.SamzaContainer$.main(SamzaContainer.scala:90) > at > org.apache.samza.container.SamzaContainer.main(SamzaContainer.scala) > {code} > More precisely, in my situation, this log occurs twice during startup of the > job. > I figured out that it follows the creation of 2 producers: {{samza_producer}} > and {{samza_checkpoint_manager}} with client id respectively equals to > {{samza_producer-my_awesome_job-123}} and > {{samza_checkpoint_manager-my_awesome_job-123}}. > This issue seems to be directly related to SAMZA-981, that remove > discriminants timestamp plus unique counter value from the {{client.id}} > string. > According to KAFKA-3992, this error occurs when multiple producers / > consumers are created with the same {{client.id}} setting. > Looking at the source code, there is a > [lock|https://github.com/ncolomer/samza/blob/17e65d1cbdda1ad436f47c15fa4c86332e229a93/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L119-L127] > in {{KafkaSystemProducer}} that should prevent any race condition where this > could happen. But is there any case where this class (or the {{getProducer}} > lambda function) may be instantiated/reused multiple time in the same JVM > host? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-995) Remove configs deprecated in 0.10.* release and update configs
[ https://issues.apache.org/jira/browse/SAMZA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533729#comment-15533729 ] Jagadish commented on SAMZA-995: Thanks Xinyu for the patch, merged and submitted. > Remove configs deprecated in 0.10.* release and update configs > -- > > Key: SAMZA-995 > URL: https://issues.apache.org/jira/browse/SAMZA-995 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11.0 >Reporter: Navina Ramesh >Assignee: Xinyu Liu > Fix For: 0.11.0 > > > "yarn.container.count" was deprecated in 0.10.0 and we replaced it with > "job.container.count". We should remove the warning message, remove > deprecated config and update configurations tables as well. > Please also scan for more such variables (if any). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version
[ https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadish resolved SAMZA-1020. - Resolution: Fixed > Remove some deprecated interfaces from 0.10 version > --- > > Key: SAMZA-1020 > URL: https://issues.apache.org/jira/browse/SAMZA-1020 > Project: Samza > Issue Type: Bug >Reporter: Navina Ramesh >Assignee: Xinyu Liu > Labels: newbie > Fix For: 0.11.0 > > > Some interfaces/methods were deprecated as a part of 0.10.0 release. They > were marked as @Deprecated . However, we yet to remove them. > Some instances I noticed were in these classes: > * readChangeLogPartitionMapping in KafkaCheckpointManager > * isChangelogPartitionMapping, getChangelogPartitionMappingKey in > KafkaCheckpointLogKey ; Probably a few more in this class -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version
[ https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadish updated SAMZA-1020: Assignee: Xinyu Liu > Remove some deprecated interfaces from 0.10 version > --- > > Key: SAMZA-1020 > URL: https://issues.apache.org/jira/browse/SAMZA-1020 > Project: Samza > Issue Type: Bug >Reporter: Navina Ramesh >Assignee: Xinyu Liu > Labels: newbie > Fix For: 0.11.0 > > > Some interfaces/methods were deprecated as a part of 0.10.0 release. They > were marked as @Deprecated . However, we yet to remove them. > Some instances I noticed were in these classes: > * readChangeLogPartitionMapping in KafkaCheckpointManager > * isChangelogPartitionMapping, getChangelogPartitionMappingKey in > KafkaCheckpointLogKey ; Probably a few more in this class -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version
[ https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533738#comment-15533738 ] Jagadish commented on SAMZA-1020: - Thanks [~xinyu] for the patch. Merged, and submitted. Thanks! > Remove some deprecated interfaces from 0.10 version > --- > > Key: SAMZA-1020 > URL: https://issues.apache.org/jira/browse/SAMZA-1020 > Project: Samza > Issue Type: Bug >Reporter: Navina Ramesh > Labels: newbie > Fix For: 0.11.0 > > > Some interfaces/methods were deprecated as a part of 0.10.0 release. They > were marked as @Deprecated . However, we yet to remove them. > Some instances I noticed were in these classes: > * readChangeLogPartitionMapping in KafkaCheckpointManager > * isChangelogPartitionMapping, getChangelogPartitionMappingKey in > KafkaCheckpointLogKey ; Probably a few more in this class -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference
[ https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533732#comment-15533732 ] Xinyu Liu commented on SAMZA-1028: -- rb: https://reviews.apache.org/r/52403/ > Moving logline in callback thread and making exception thrown AtomicReference > - > > Key: SAMZA-1028 > URL: https://issues.apache.org/jira/browse/SAMZA-1028 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11 > > > Current the error log happens after produce close and reset the exception in > later callbacks, which caused the trouble shooting to be harder in cases of > multithreading. We should log error before closing and keep atomic reference > of the initial exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-995) Remove configs deprecated in 0.10.* release and update configs
[ https://issues.apache.org/jira/browse/SAMZA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533726#comment-15533726 ] Jagadish commented on SAMZA-995: RB: https://reviews.apache.org/r/52401/ . > Remove configs deprecated in 0.10.* release and update configs > -- > > Key: SAMZA-995 > URL: https://issues.apache.org/jira/browse/SAMZA-995 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11.0 >Reporter: Navina Ramesh >Assignee: Xinyu Liu > Fix For: 0.11.0 > > > "yarn.container.count" was deprecated in 0.10.0 and we replaced it with > "job.container.count". We should remove the warning message, remove > deprecated config and update configurations tables as well. > Please also scan for more such variables (if any). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference
Xinyu Liu created SAMZA-1028: Summary: Moving logline in callback thread and making exception thrown AtomicReference Key: SAMZA-1028 URL: https://issues.apache.org/jira/browse/SAMZA-1028 Project: Samza Issue Type: Bug Affects Versions: 0.11 Reporter: Xinyu Liu Assignee: Xinyu Liu Fix For: 0.11 Current the error log happens after produce close and reset the exception in later callbacks, which caused the trouble shooting to be harder in cases of multithreading. We should log error before closing and keep atomic reference of the initial exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] samza git commit: SAMZA-995: Remove deprecated property yarn.container.count from config table.
SAMZA-995: Remove deprecated property yarn.container.count from config table. Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/317b6ff1 Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/317b6ff1 Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/317b6ff1 Branch: refs/heads/master Commit: 317b6ff1be8b497cd1468fddf3975b2832cf9d74 Parents: db31e89 Author: Xinyu LiuAuthored: Thu Sep 29 11:50:42 2016 -0700 Committer: vjagadish1989 Committed: Thu Sep 29 11:54:49 2016 -0700 -- docs/contribute/coding-guide.md | 2 +- .../documentation/versioned/container/coordinator-stream.md | 2 +- .../documentation/versioned/jobs/configuration-table.html | 9 - docs/learn/documentation/versioned/jobs/reprocessing.md | 2 +- docs/learn/documentation/versioned/jobs/yarn-jobs.md| 2 +- docs/learn/tutorials/versioned/remote-debugging-samza.md| 2 +- 6 files changed, 5 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/contribute/coding-guide.md -- diff --git a/docs/contribute/coding-guide.md b/docs/contribute/coding-guide.md index 6cb4cee..659834f 100644 --- a/docs/contribute/coding-guide.md +++ b/docs/contribute/coding-guide.md @@ -91,7 +91,7 @@ We are following the style guide given here (though not perfectly). Below are so * All configuration names that define a class must end with .class (e.g. task.command.class). * All configuration names that define a factory class must end with .factory.class (e.g. systems.kafka.consumer.factory.class). * Configuration will always be defined as simple key/value pairs (e.g. a=b). -* When configuration is related, it must be grouped using the same prefix (e.g. yarn.container.count=1, yarn.container.memory.bytes=1073741824). +* When configuration is related, it must be grouped using the same prefix (e.g. job.container.count=1, yarn.container.memory.bytes=1073741824). * When configuration must be defined multiple times, the key should be parameterized (e.g. systems.kafka.consumer.factory=x, systems.kestrel.consumer.factory=y). *When such configuration must be referred to, its parameter should be used (e.g. foo.bar.system=kafka, foo.bar.system=kestrel). * All getter methods must be a camel case match with their configuration names (e.g. yarn.package.uri and getYarnPackageUri). * Reading configuration should only be done in factories and main methods. Don't pass Config objects around. http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/container/coordinator-stream.md -- diff --git a/docs/learn/documentation/versioned/container/coordinator-stream.md b/docs/learn/documentation/versioned/container/coordinator-stream.md index b680593..6b14960 100644 --- a/docs/learn/documentation/versioned/container/coordinator-stream.md +++ b/docs/learn/documentation/versioned/container/coordinator-stream.md @@ -116,7 +116,7 @@ Samza provides a command line tool to write Job Configuration messages to the co samza-example/target/bin/run-coordinator-stream-writer.sh \ --config-path=file:///path/to/job/config.properties \ --type set-config \ - --key yarn.container.count \ + --key job.container.count \ --value 8 {% endhighlight %} http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/jobs/configuration-table.html -- diff --git a/docs/learn/documentation/versioned/jobs/configuration-table.html b/docs/learn/documentation/versioned/jobs/configuration-table.html index 1d16f52..f60cd50 100644 --- a/docs/learn/documentation/versioned/jobs/configuration-table.html +++ b/docs/learn/documentation/versioned/jobs/configuration-table.html @@ -1524,15 +1524,6 @@ -yarn.container.count -1 - -This is deprecated in favor of -job.container.count - - - - yarn.container.memory.mb 1024 http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/jobs/reprocessing.md -- diff --git a/docs/learn/documentation/versioned/jobs/reprocessing.md b/docs/learn/documentation/versioned/jobs/reprocessing.md index 174b46f..97d8801 100644 --- a/docs/learn/documentation/versioned/jobs/reprocessing.md +++
[1/2] samza git commit: SAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in the 0.10 release.
Repository: samza Updated Branches: refs/heads/master 985822028 -> 317b6ff1b SAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in the 0.10 release. Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/db31e899 Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/db31e899 Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/db31e899 Branch: refs/heads/master Commit: db31e8992c7bcbe17a5ad85bbab6105e232a626a Parents: 9858220 Author: vjagadish1989Authored: Thu Sep 29 11:48:22 2016 -0700 Committer: vjagadish1989 Committed: Thu Sep 29 11:48:22 2016 -0700 -- .../kafka/KafkaCheckpointLogKey.scala | 23 .../kafka/KafkaCheckpointManager.scala | 22 --- .../kafka/TeskKafkaCheckpointLogKey.scala | 10 - 3 files changed, 55 deletions(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/db31e899/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala -- diff --git a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala index ea8462d..9ed64c3 100644 --- a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala +++ b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala @@ -57,13 +57,6 @@ class KafkaCheckpointLogKey private (val map: Map[String, String]) { */ def isCheckpointKey = getKey.equals(CHECKPOINT_KEY_TYPE) - /** - * Is this key for a changelog partition mapping? - * - * @return true iff this key's entry is for a changelog partition mapping - */ - @Deprecated - def isChangelogPartitionMapping = getKey.equals(CHANGELOG_PARTITION_KEY_TYPE) /** * If this Key is for a checkpoint entry, return its associated TaskName. @@ -98,17 +91,9 @@ object KafkaCheckpointLogKey { val CHECKPOINT_KEY_KEY = "type" val CHECKPOINT_KEY_TYPE = "checkpoint" - @Deprecated - val CHANGELOG_PARTITION_KEY_TYPE = "changelog-partition-mapping" - val CHECKPOINT_TASKNAME_KEY = "taskName" val SYSTEMSTREAMPARTITION_GROUPER_FACTORY_KEY = "systemstreampartition-grouper-factory" - /** - * Partition mapping keys have no dynamic values, so we just need one instance. - */ - @Deprecated - val CHANGELOG_PARTITION_MAPPING_KEY = new KafkaCheckpointLogKey(Map(CHECKPOINT_KEY_KEY -> CHANGELOG_PARTITION_KEY_TYPE)) private val JSON_MAPPER = new ObjectMapper() val KEY_TYPEREFERENCE = new TypeReference[util.HashMap[String, String]]() {} @@ -146,14 +131,6 @@ object KafkaCheckpointLogKey { } /** - * Build a key for a changelog partition mapping entry - * - * @return Key for changelog partition mapping entry - */ - @Deprecated - def getChangelogPartitionMappingKey() = CHANGELOG_PARTITION_MAPPING_KEY - - /** * Deserialize a Kafka checkpoint log key * @param bytes Serialized (via JSON) Kafka checkpoint log key * @return Checkpoint log key http://git-wip-us.apache.org/repos/asf/samza/blob/db31e899/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala -- diff --git a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala index ea10cae..8f18a92 100644 --- a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala +++ b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala @@ -290,28 +290,6 @@ class KafkaCheckpointManager( } } - - /** - * Read through entire log, discarding checkpoints, finding latest changelogPartitionMapping - * To be used for Migration purpose only. In newer version, changelogPartitionMapping will be handled through coordinator stream - */ - @Deprecated - def readChangeLogPartitionMapping(): util.Map[TaskName, java.lang.Integer] = { -var changelogPartitionMapping: util.Map[TaskName, java.lang.Integer] = new util.HashMap[TaskName, java.lang.Integer]() - -def shouldHandleEntry(key: KafkaCheckpointLogKey) = key.isChangelogPartitionMapping - -def handleCheckpoint(payload: ByteBuffer, checkpointKey:KafkaCheckpointLogKey): Unit = { - changelogPartitionMapping = serde.changelogPartitionMappingFromBytes(Utils.readBytes(payload)) - - debug("Adding changelog partition mapping" + changelogPartitionMapping) -} - -readLog(CHANGELOG_PARTITION_MAPPING_LOG4j,
[jira] [Updated] (SAMZA-880) Moving to github/pull-request for code review and check-in
[ https://issues.apache.org/jira/browse/SAMZA-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navina Ramesh updated SAMZA-880: Fix Version/s: (was: 0.11.0) > Moving to github/pull-request for code review and check-in > -- > > Key: SAMZA-880 > URL: https://issues.apache.org/jira/browse/SAMZA-880 > Project: Samza > Issue Type: Task >Reporter: Yi Pan (Data Infrastructure) >Assignee: Jagadish > Attachments: MovingSamzatoPullRequests.pdf, SAMZA-880-1.patch, > SAMZA-880.0.patch > > > As per mailing list > [discussion|http://mail-archives.apache.org/mod_mbox/samza-dev/201602.mbox/%3ccafvexu230gjja7z99lzn6bhnekypjrnof3kyc+dvgko_znq...@mail.gmail.com%3E], > we are going to change the code review and commit process to use github pull > request. This would require a few things: > 1) incorporate kafka-merge-pr.py from [kafka > repo|https://github.com/apache/kafka/blob/trunk/kafka-merge-pr.py] to Samza > that include the merge steps for github pull requests to Apache Samza git repo > 2) update the documentation in contributor corner for the new developer > 3) Document the commiter instructions on using kafka-merge-pr.py. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages
[ https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jake Maes updated SAMZA-1023: - Fix Version/s: 0.11 > ContainerProcessManager spams the job coordinator log with "TaskManager > state" messages > --- > > Key: SAMZA-1023 > URL: https://issues.apache.org/jira/browse/SAMZA-1023 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Jake Maes >Assignee: Branislav Cogic > Fix For: 0.11 > > Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch > > > There are 2 problems with the message in the Summary: > 1. The message is printed every second in the job coordinator log. Do we > really want to keep periodically printing that message indefinitely? It would > certainly be a departure from our convention thus far. > 2. Since it starts with "Too many FailedContainers" it gets interpreted as a > problem, some reformatting, reordering, or rewording would help. > {noformat} > 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness:
[jira] [Commented] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages
[ https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532863#comment-15532863 ] Jake Maes commented on SAMZA-1023: -- +1 Merged and committed. Thanks [~banecogic]! > ContainerProcessManager spams the job coordinator log with "TaskManager > state" messages > --- > > Key: SAMZA-1023 > URL: https://issues.apache.org/jira/browse/SAMZA-1023 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Jake Maes >Assignee: Branislav Cogic > Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch > > > There are 2 problems with the message in the Summary: > 1. The message is printed every second in the job coordinator log. Do we > really want to keep periodically printing that message indefinitely? It would > certainly be a departure from our convention thus far. > 2. Since it starts with "Too many FailedContainers" it gets interpreted as a > problem, some reformatting, reordering, or rewording would help. > {noformat} > 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured
samza git commit: SAMZA-1023 ContainerProcessManager spams the job coordinator log with 'TaskManager state' messages
Repository: samza Updated Branches: refs/heads/master 623a0a9b2 -> 985822028 SAMZA-1023 ContainerProcessManager spams the job coordinator log with 'TaskManager state' messages Project: http://git-wip-us.apache.org/repos/asf/samza/repo Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/98582202 Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/98582202 Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/98582202 Branch: refs/heads/master Commit: 985822028a71547189ce84faae3d6af1edd4c909 Parents: 623a0a9 Author: Branislav CogicAuthored: Thu Sep 29 06:56:24 2016 -0700 Committer: Jacob Maes Committed: Thu Sep 29 06:56:24 2016 -0700 -- .../org/apache/samza/clustermanager/ContainerProcessManager.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/samza/blob/98582202/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java -- diff --git a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java index 1fed2fb..b4309d9 100644 --- a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java +++ b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java @@ -150,8 +150,8 @@ public class ContainerProcessManager implements ClusterResourceManager.Callback } public boolean shouldShutdown() { -log.info(" TaskManager state: Too many FailedContainers: {} No. Completed containers: {} Num Configured containers: {}" + - " AllocatorThread liveness: {} ", new Object[]{tooManyFailedContainers, state.completedContainers.get(), state.containerCount, allocatorThread.isAlive()}); +log.debug(" TaskManager state: Completed containers: {}, Configured containers: {}, Is there too many FailedContainers: {}, Is AllocatorThread alive: {} " + , new Object[]{state.completedContainers.get(), state.containerCount, tooManyFailedContainers ? "yes" : "no", allocatorThread.isAlive() ? "yes" : "no"}); if (exceptionOccurred != null) { log.error("Exception in ContainerProcessManager", exceptionOccurred);
[jira] [Created] (SAMZA-1027) InstanceAlreadyExistsException while starting up
Nicolas Colomer created SAMZA-1027: -- Summary: InstanceAlreadyExistsException while starting up Key: SAMZA-1027 URL: https://issues.apache.org/jira/browse/SAMZA-1027 Project: Samza Issue Type: Bug Components: kafka Affects Versions: 0.11 Reporter: Nicolas Colomer After upgrading Samza, I started to see following WARN log while starting a Samza job: {code} 2016-09-29 10:30:09 AppInfoParser [WARN] task[] ssp[] offset[] Error registering AppInfo mbean javax.management.InstanceAlreadyExistsException: kafka.producer:type=app-info,id=samza_producer-my_awesome_job-123 at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) at org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:58) at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:328) at org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:163) at org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89) at org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89) at org.apache.samza.system.kafka.KafkaSystemProducer.send(KafkaSystemProducer.scala:124) at org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:113) at org.apache.samza.coordinator.stream.AbstractCoordinatorStreamManager.send(AbstractCoordinatorStreamManager.java:72) at org.apache.samza.container.LocalityManager.writeContainerToHostMapping(LocalityManager.java:134) at org.apache.samza.container.SamzaContainer.startLocalityManager(SamzaContainer.scala:739) at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:651) at org.apache.samza.container.SamzaContainer$.safeMain(SamzaContainer.scala:116) at org.apache.samza.container.SamzaContainer$.main(SamzaContainer.scala:90) at org.apache.samza.container.SamzaContainer.main(SamzaContainer.scala) {code} More precisely, in my situation, this log occurs twice during startup of the job. I figured out that it follows the creation of 2 producers: {{samza_producer}} and {{samza_checkpoint_manager}} with client id respectively equals to {{samza_producer-my_awesome_job-123}} and {{samza_checkpoint_manager-my_awesome_job-123}}. This issue seems to be directly related to SAMZA-981, that remove discriminants timestamp plus unique counter value from the {{client.id}} string. According to KAFKA-3992, this error occurs when multiple producers / consumers are created with the same {{client.id}} setting. Looking at the source code, there is a [lock|https://github.com/ncolomer/samza/blob/17e65d1cbdda1ad436f47c15fa4c86332e229a93/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L119-L127] in {{KafkaSystemProducer}} that should prevent any race condition where this could happen. But is there any case where this class (or the {{getProducer}} lambda function) may be instantiated/reused multiple time in the same JVM host? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-981) Set consistent Kafka clientId for a job instance
[ https://issues.apache.org/jira/browse/SAMZA-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532352#comment-15532352 ] Nicolas Colomer commented on SAMZA-981: --- It seems you omitted to remove the {{counter}} val of type {{AtomicLong}} which is now not used anymore in {{KafkaUtil}}. > Set consistent Kafka clientId for a job instance > > > Key: SAMZA-981 > URL: https://issues.apache.org/jira/browse/SAMZA-981 > Project: Samza > Issue Type: Improvement >Affects Versions: 0.11 >Reporter: Xinyu Liu >Assignee: Xinyu Liu > Fix For: 0.11 > > Attachments: SAMZA-981.0.patch > > > Remove the currentTimeMills and counter in the ClientId creation so the > clientId will be the same for each job instance. With this change we can turn > on kafka throttling on the job instance level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages
[ https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532191#comment-15532191 ] Nicolas Colomer commented on SAMZA-1023: Upvote for this one :) > ContainerProcessManager spams the job coordinator log with "TaskManager > state" messages > --- > > Key: SAMZA-1023 > URL: https://issues.apache.org/jira/browse/SAMZA-1023 > Project: Samza > Issue Type: Bug >Affects Versions: 0.11 >Reporter: Jake Maes >Assignee: Branislav Cogic > Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch > > > There are 2 problems with the message in the Summary: > 1. The message is printed every second in the job coordinator log. Do we > really want to keep periodically printing that message indefinitely? It would > certainly be a departure from our convention thus far. > 2. Since it starts with "Too many FailedContainers" it gets interpreted as a > problem, some reformatting, reordering, or rewording would help. > {noformat} > 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5 AllocatorThread liveness: true > 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO] TaskManager > state: Too many FailedContainers: false No. Completed containers: 0 Num > Configured containers: 5