[jira] [Updated] (SAMZA-880) Moving to github/pull-request for code review and check-in

2016-09-29 Thread Navina Ramesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navina Ramesh updated SAMZA-880:

Fix Version/s: 0.11.0

> Moving to github/pull-request for code review and check-in
> --
>
> Key: SAMZA-880
> URL: https://issues.apache.org/jira/browse/SAMZA-880
> Project: Samza
>  Issue Type: Task
>Reporter: Yi Pan (Data Infrastructure)
>Assignee: Jagadish
> Fix For: 0.11.0
>
> Attachments: MovingSamzatoPullRequests.pdf, SAMZA-880-1.patch, 
> SAMZA-880.0.patch
>
>
> As per mailing list 
> [discussion|http://mail-archives.apache.org/mod_mbox/samza-dev/201602.mbox/%3ccafvexu230gjja7z99lzn6bhnekypjrnof3kyc+dvgko_znq...@mail.gmail.com%3E],
>  we are going to change the code review and commit process to use github pull 
> request. This would require a few things:
> 1) incorporate kafka-merge-pr.py from the [kafka 
> repo|https://github.com/apache/kafka/blob/trunk/kafka-merge-pr.py] into Samza; 
> it includes the merge steps for GitHub pull requests to the Apache Samza git repo
> 2) update the documentation in the contributor corner for new developers
> 3) document the committer instructions for using kafka-merge-pr.py.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SAMZA-1030) Add documentation for change in the contribution process

2016-09-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534658#comment-15534658
 ] 

ASF GitHub Bot commented on SAMZA-1030:
---

Github user asfgit closed the pull request at:

https://github.com/apache/samza/pull/16


> Add documentation for change in the contribution process
> 
>
> Key: SAMZA-1030
> URL: https://issues.apache.org/jira/browse/SAMZA-1030
> Project: Samza
>  Issue Type: Sub-task
>Reporter: Navina Ramesh
>Assignee: Navina Ramesh
> Fix For: 0.11
>
>
> We have moved away from RB for contributing patches to Apache Samza and want 
> to use GitHub pull requests. We need documentation on the website for the new 
> workflow.





samza git commit: SAMZA-1030; Add documentation for change in the contribution process

2016-09-29 Thread jagadish
Repository: samza
Updated Branches:
  refs/heads/master e6956a37b -> 36bcbacc2


SAMZA-1030; Add documentation for change in the contribution process

I re-organized some of the website files and created a "Contributor's Corner" 
that collates all info related to new contributors and committers

Also updated the constants in the merge script to match the website 
documentation.
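
The remote-name constants in the merge script are read from the environment with a fallback default, so a committer whose remotes are named differently can override them without editing the script. A minimal sketch of that lookup, using the same `os.environ.get` pattern and the new defaults from this commit (the helper `remote_names` is hypothetical and exists only for illustration):

```python
import os


def remote_names(env=None):
    """Resolve the two git remote names the way the merge script does."""
    env = os.environ if env is None else env
    # Remote that points at the GitHub site; falls back to the documented name.
    pr_remote = env.get("PR_REMOTE_NAME", "samza-github")
    # Remote that points at Apache git and receives the push.
    push_remote = env.get("PUSH_REMOTE_NAME", "samza-apache")
    return pr_remote, push_remote


if __name__ == "__main__":
    print(remote_names())
```

Setting `PR_REMOTE_NAME` or `PUSH_REMOTE_NAME` in the shell before running the script would take precedence over the defaults.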

Author: Navina Ramesh 

Reviewers: Jagadish 

Closes #16 from navina/adding-docs


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/36bcbacc
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/36bcbacc
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/36bcbacc

Branch: refs/heads/master
Commit: 36bcbacc27faf07621fad44e8edea58eafb94fae
Parents: e6956a3
Author: Navina Ramesh 
Authored: Thu Sep 29 18:09:04 2016 -0700
Committer: vjagadish1989 
Committed: Thu Sep 29 18:09:04 2016 -0700

--
 bin/merge-pull-request.py  |   4 +-
 docs/_layouts/default.html |   4 +-
 docs/contribute/code.md|   6 +-
 docs/contribute/coding-guide.md|   7 ++
 docs/contribute/contributors-corner.md | 116 
 docs/contribute/projects.md|  34 
 docs/contribute/rules.md   |  40 --
 7 files changed, 128 insertions(+), 83 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/bin/merge-pull-request.py
--
diff --git a/bin/merge-pull-request.py b/bin/merge-pull-request.py
index 46c67e4..5f1dae7 100755
--- a/bin/merge-pull-request.py
+++ b/bin/merge-pull-request.py
@@ -51,9 +51,9 @@ PROJECT_NAME = "samza"
 CAPITALIZED_PROJECT_NAME = "samza".upper()
 
 # Remote name which points to the GitHub site
-PR_REMOTE_NAME = os.environ.get("PR_REMOTE_NAME", "apache-github")
+PR_REMOTE_NAME = os.environ.get("PR_REMOTE_NAME", "samza-github")
 # Remote name which points to Apache git
-PUSH_REMOTE_NAME = os.environ.get("PUSH_REMOTE_NAME", "apache")
+PUSH_REMOTE_NAME = os.environ.get("PUSH_REMOTE_NAME", "samza-apache")
 # ASF JIRA username
 JIRA_USERNAME = os.environ.get("JIRA_USERNAME", "")
 # ASF JIRA password

http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/_layouts/default.html
--
diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html
index d20e603..60e56b5 100644
--- a/docs/_layouts/default.html
+++ b/docs/_layouts/default.html
@@ -92,12 +92,10 @@
 
  Contribute
 
-  Rules
+  Contributor's 
Corner
   Coding Guide
-  Projects
   Design 
Documents
   Code
-  https://reviews.apache.org/groups/samza;>Review 
Board
   Tests
 
 

http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/contribute/code.md
--
diff --git a/docs/contribute/code.md b/docs/contribute/code.md
index be85c8d..2ebe354 100644
--- a/docs/contribute/code.md
+++ b/docs/contribute/code.md
@@ -27,8 +27,6 @@ You can check out Samza's code by running:
 git clone http://git-wip-us.apache.org/repos/asf/samza.git
 ```
 
-Please see the [Rules](rules.html) page for information on how to contribute.
+Official releases can be downloaded from [here](/startup/download)
 
-If you are a committer you need to use https instead of http to check in, 
otherwise you will get an error regarding an inability to acquire a lock. Note 
that older versions of git may also give this error even when the repo was 
cloned with https; if you experience this try a newer version of git.
-
-The Samza website is built by Jekyll from the markdown files found in the docs 
subdirectory. For committers wishing to update the webpage, please see 
`docs/README.md` for instructions.
+Please see [Contributor's Corner'](rules.html) for information on how to 
contribute.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/36bcbacc/docs/contribute/coding-guide.md
--
diff --git a/docs/contribute/coding-guide.md b/docs/contribute/coding-guide.md
index 659834f..260c488 100644
--- a/docs/contribute/coding-guide.md
+++ b/docs/contribute/coding-guide.md
@@ -22,6 +22,13 @@ title: Coding Guide
 
 
 These guidelines are meant to encourage consistency and best practices amongst 
people working on the Samza code base. They should be observed unless there is 
a compelling reason to ignore them.
+In general, patches should include:
+
+* Code change
+* Unit tests
+* Javadocs
+* 

[jira] [Commented] (SAMZA-1030) Add documentation for change in the contribution process

2016-09-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534574#comment-15534574
 ] 

ASF GitHub Bot commented on SAMZA-1030:
---

GitHub user navina opened a pull request:

https://github.com/apache/samza/pull/16

SAMZA-1030 : Add documentation for change in the contribution process

I re-organized some of the website files and created a "Contributor's 
Corner" that collates all info related to new contributors and committers 

Also updated the constants in the merge script to match the website 
documentation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/navina/samza adding-docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/samza/pull/16.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16


commit 3ac500d766475af5eebbf6a8c8b258c94e2666fa
Author: Navina Ramesh 
Date:   2016-09-30T00:28:04Z

Adding doc for contributors page and fixing code in merge script




> Add documentation for change in the contribution process
> 
>
> Key: SAMZA-1030
> URL: https://issues.apache.org/jira/browse/SAMZA-1030
> Project: Samza
>  Issue Type: Sub-task
>Reporter: Navina Ramesh
>Assignee: Navina Ramesh
> Fix For: 0.11.0
>
>
> We have moved away from RB for contributing patches to Apache Samza and want 
> to use GitHub pull requests. We need documentation on the website for the new 
> workflow.





[jira] [Created] (SAMZA-1030) Add documentation for change in the contribution process

2016-09-29 Thread Navina Ramesh (JIRA)
Navina Ramesh created SAMZA-1030:


 Summary: Add documentation for change in the contribution process
 Key: SAMZA-1030
 URL: https://issues.apache.org/jira/browse/SAMZA-1030
 Project: Samza
  Issue Type: Sub-task
Reporter: Navina Ramesh
Assignee: Navina Ramesh
 Fix For: 0.11.0


We have moved away from RB for contributing patches to Apache Samza and want to 
use GitHub pull requests. We need documentation on the website for the new 
workflow.





[samza] Git Push Summary

2016-09-29 Thread xinyu
Repository: samza
Updated Tags:  refs/tags/release-0.11.0-rc0 [created] 7271e02af


[jira] [Commented] (SAMZA-1029) Prepare release candidate for 0.11.0

2016-09-29 Thread Xinyu Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534353#comment-15534353
 ] 

Xinyu Liu commented on SAMZA-1029:
--

The list of work items:

- prepare binary artifacts and make the source tarball available for testing 
(before the vote)
- officially upload artifacts to the Apache repo (after the vote has passed)
- update and release documents, following docs/README.md, specifically the 
"Release-new-version Website Checklist" section.

After this is done, we can officially update the Samza blog with the new release.

> Prepare release candidate for 0.11.0
> 
>
> Key: SAMZA-1029
> URL: https://issues.apache.org/jira/browse/SAMZA-1029
> Project: Samza
>  Issue Type: Task
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11, 0.11.0
>
>
> Upgrade to Samza 0.11.





[samza] Git Push Summary

2016-09-29 Thread xinyu
Repository: samza
Updated Branches:
  refs/heads/0.11.0 [created] e6956a37b


[jira] [Created] (SAMZA-1029) Prepare release candidate for 0.11.0

2016-09-29 Thread Xinyu Liu (JIRA)
Xinyu Liu created SAMZA-1029:


 Summary: Prepare release candidate for 0.11.0
 Key: SAMZA-1029
 URL: https://issues.apache.org/jira/browse/SAMZA-1029
 Project: Samza
  Issue Type: Task
Reporter: Xinyu Liu
Assignee: Xinyu Liu
 Fix For: 0.11, 0.11.0


Upgrade to Samza 0.11.





samza git commit: Update version for 0.11.0 release

2016-09-29 Thread xinyu
Repository: samza
Updated Branches:
  refs/heads/master 61d35f26c -> e6956a37b


Update version for 0.11.0 release


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/e6956a37
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/e6956a37
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/e6956a37

Branch: refs/heads/master
Commit: e6956a37ba358b45b12df2ceeb6add5834fc8dd6
Parents: 61d35f2
Author: Xinyu Liu 
Authored: Thu Sep 29 15:45:17 2016 -0700
Committer: Xinyu Liu 
Committed: Thu Sep 29 15:45:17 2016 -0700

--
 gradle.properties | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/e6956a37/gradle.properties
--
diff --git a/gradle.properties b/gradle.properties
index 16e1f5d..c94f619 100644
--- a/gradle.properties
+++ b/gradle.properties
@@ -15,7 +15,7 @@
 # specific language governing permissions and limitations
 # under the License.
 group=org.apache.samza
-version=0.10.1-SNAPSHOT
+version=0.11.0
 scalaVersion=2.10
 
 gradleVersion=2.0



[jira] [Commented] (SAMZA-681) Create a unit test harness to easily test samza tasks

2016-09-29 Thread Yi Pan (Data Infrastructure) (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534245#comment-15534245
 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-681:


[~joseangel], assigned to you. Thanks!

> Create a unit test harness to easily test samza tasks
> -
>
> Key: SAMZA-681
> URL: https://issues.apache.org/jira/browse/SAMZA-681
> Project: Samza
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.9.0
>Reporter: Luis De Pombo
>Assignee: Jose Martinez
> Attachments: SAMZA-381.patch
>
>
> We don't have a standardized test framework for unit testing samza tasks. 
> Currently, in order to test the proper functionality of a task we have to 
> bring up the system (kafka) and its dependencies (zk). 
> Goals:
> - The harness should abstract several notions around the system and its 
> partitions, without mocking them, and hold task context and state in memory. 
> - The harness should let you inject several  pairs that would 
> normally be passed in by the `SystemStream` provided in `tasks.input`. 
> - The same harness should offer an interface to visualize all the 
> corresponding  pairs collected in the message `Collector`, which 
> are the output of the `process`ed or `window`ed input  pairs 
> previously injected.
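
The goals above can be pictured with a small in-memory sketch. Everything in it is hypothetical: `TaskHarness`, `InMemoryCollector` and `UppercaseTask` are illustration-only names, the simplified `process(key, message, collector)` signature is not Samza's task API, and the injected items are assumed to be (key, message) pairs because the pair type is missing from the text above.

```python
class InMemoryCollector:
    """Collects everything a task sends, in order, without a real system."""

    def __init__(self):
        self.sent = []

    def send(self, key, message):
        self.sent.append((key, message))


class TaskHarness:
    """Feeds pairs to a task and exposes what the task emitted."""

    def __init__(self, task):
        self.task = task
        self.collector = InMemoryCollector()

    def inject(self, pairs):
        # Each pair is handed to the task as if it came from an input stream.
        for key, message in pairs:
            self.task.process(key, message, self.collector)
        return self

    def outputs(self):
        return list(self.collector.sent)


class UppercaseTask:
    """Toy task used only to exercise the harness."""

    def process(self, key, message, collector):
        collector.send(key, message.upper())


if __name__ == "__main__":
    harness = TaskHarness(UppercaseTask()).inject([("k1", "a"), ("k2", "b")])
    print(harness.outputs())
```

The point of the sketch is only that no Kafka or ZooKeeper process is needed to observe a task's output for a given input.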





[jira] [Updated] (SAMZA-681) Create a unit test harness to easily test samza tasks

2016-09-29 Thread Yi Pan (Data Infrastructure) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Pan (Data Infrastructure) updated SAMZA-681:
---
Assignee: Jose Martinez

> Create a unit test harness to easily test samza tasks
> -
>
> Key: SAMZA-681
> URL: https://issues.apache.org/jira/browse/SAMZA-681
> Project: Samza
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.9.0
>Reporter: Luis De Pombo
>Assignee: Jose Martinez
> Attachments: SAMZA-381.patch
>
>
> We don't have a standardized test framework for unit testing samza tasks. 
> Currently, in order to test the proper functionality of a task we have to 
> bring up the system (kafka) and its dependencies (zk). 
> Goals:
> - The harness should abstract several notions around the system and its 
> partitions, without mocking them, and hold task context and state in memory. 
> - The harness should let you inject several  pairs that would 
> normally be passed in by the `SystemStream` provided in `tasks.input`. 
> - The same harness should offer an interface to visualize all the 
> corresponding  pairs collected in the message `Collector`, which 
> are the output of the `process`ed or `window`ed input  pairs 
> previously injected.





[jira] [Closed] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference

2016-09-29 Thread Jagadish (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadish closed SAMZA-1028.
---

> Moving logline in callback thread and making exception thrown AtomicReference
> -
>
> Key: SAMZA-1028
> URL: https://issues.apache.org/jira/browse/SAMZA-1028
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11
>
>
> Currently the error is logged after the producer is closed, and the exception 
> is reset in later callbacks, which makes troubleshooting harder in 
> multithreaded cases. We should log the error before closing and keep an atomic 
> reference to the initial exception.





[jira] [Commented] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference

2016-09-29 Thread Jagadish (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534186#comment-15534186
 ] 

Jagadish commented on SAMZA-1028:
-

Thanks for the patch. Merged and submitted!

> Moving logline in callback thread and making exception thrown AtomicReference
> -
>
> Key: SAMZA-1028
> URL: https://issues.apache.org/jira/browse/SAMZA-1028
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11
>
>
> Currently the error is logged after the producer is closed, and the exception 
> is reset in later callbacks, which makes troubleshooting harder in 
> multithreaded cases. We should log the error before closing and keep an atomic 
> reference to the initial exception.





[jira] [Resolved] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference

2016-09-29 Thread Jagadish (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadish resolved SAMZA-1028.
-
Resolution: Fixed

> Moving logline in callback thread and making exception thrown AtomicReference
> -
>
> Key: SAMZA-1028
> URL: https://issues.apache.org/jira/browse/SAMZA-1028
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11
>
>
> Currently the error is logged after the producer is closed, and the exception 
> is reset in later callbacks, which makes troubleshooting harder in 
> multithreaded cases. We should log the error before closing and keep an atomic 
> reference to the initial exception.





samza git commit: SAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for tracking producer exceptions

2016-09-29 Thread jagadish
Repository: samza
Updated Branches:
  refs/heads/master 317b6ff1b -> 61d35f26c


SAMZA-1028: Fix KafkaSystemProducer logging and use an AtomicReference for 
tracking producer exceptions
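
The "keep the first exception" idea can be sketched independently of the Scala code. The Python analogue below is an illustration under stated assumptions, not the committed implementation: `FirstErrorHolder`, `record` and `take` are hypothetical names, and a lock stands in for Java's `AtomicReference`. `take` mirrors the `getAndSet(null)` call visible in the diff; `record` corresponds to what a compare-and-set against null would do, which is an assumption because the relevant part of the diff is cut off.

```python
import threading


class FirstErrorHolder:
    """Keeps the first error recorded by any thread until it is taken."""

    def __init__(self):
        self._lock = threading.Lock()
        self._error = None

    def record(self, error):
        """Store the error only if none is pending; return True if stored."""
        with self._lock:
            if self._error is None:
                self._error = error
                return True
            return False

    def take(self):
        """Return the pending error (or None) and clear the slot."""
        with self._lock:
            error, self._error = self._error, None
            return error


if __name__ == "__main__":
    holder = FirstErrorHolder()
    holder.record(RuntimeError("first failure"))
    holder.record(RuntimeError("later failure"))  # ignored, first one wins
    print(repr(holder.take()))
```

Clearing the slot when the error is taken lets a caller that catches the exception retry the send without seeing the same stale error again.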


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/61d35f26
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/61d35f26
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/61d35f26

Branch: refs/heads/master
Commit: 61d35f26c1f924f376b30f60d05c1957364c05e5
Parents: 317b6ff
Author: Xinyu Liu 
Authored: Thu Sep 29 14:51:23 2016 -0700
Committer: vjagadish1989 
Committed: Thu Sep 29 14:52:27 2016 -0700

--
 .../system/kafka/KafkaSystemProducer.scala  | 69 +---
 1 file changed, 47 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/61d35f26/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
--
diff --git 
a/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
 
b/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
index 5ff6d3c..aac53fc 100644
--- 
a/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
+++ 
b/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
@@ -20,8 +20,8 @@
 package org.apache.samza.system.kafka
 
 
-import java.util.concurrent.ConcurrentHashMap
-import java.util.concurrent.Future
+import java.util.concurrent.atomic.AtomicReference
+import java.util.concurrent.{TimeUnit, ConcurrentHashMap, Future}
 
 import org.apache.kafka.clients.producer.Callback
 import org.apache.kafka.clients.producer.Producer
@@ -56,12 +56,13 @@ class KafkaSystemProducer(systemName: String,
 @volatile
 var latestFuture: Future[RecordMetadata] = null
 /**
- * exceptionThrown: to store the exception in case of any "ultimate" send 
failure (ie. failure
+ * exceptionInCallback: to store the exception in case of any "ultimate" 
send failure (ie. failure
  * after exhausting max_retries in Kafka producer) in the I/O thread, we 
do not continue to queue up more send
  * requests from the samza thread. It helps the samza thread identify if 
the failure happened in I/O thread or not.
+ *
+ * In cases of multiple exceptions in the callbacks, we keep the first one 
before throwing.
  */
-@volatile
-var exceptionThrown: SamzaException = null
+var exceptionInCallback: AtomicReference[SamzaException] = new 
AtomicReference[SamzaException]()
   }
 
   @volatile var producer: Producer[Array[Byte], Array[Byte]] = null
@@ -80,13 +81,13 @@ class KafkaSystemProducer(systemName: String,
   producer = null
 
   sources.foreach {p =>
-if (p._2.exceptionThrown == null) {
+if (p._2.exceptionInCallback.get() == null) {
   flush(p._1)
 }
   }
 }
   } catch {
-case e: Exception => logger.error(e.getMessage, e)
+case e: Exception => error(e.getMessage, e)
   }
 }
   }
@@ -97,6 +98,21 @@ class KafkaSystemProducer(systemName: String,
 }
   }
 
+  def closeAndNullifyCurrentProducer(currentProducer: Producer[Array[Byte], 
Array[Byte]]) {
+try {
+  // TODO: we should use timeout close() to make sure we fail all waiting 
messages in kafka 0.9+
+  currentProducer.close()
+} catch {
+  case e: Exception => error("producer close failed", e)
+}
+producerLock.synchronized {
+  if (currentProducer == producer) {
+// only nullify the member producer if it is still the same object, no 
point nullifying new producer
+producer = null
+  }
+}
+  }
+
   def send(source: String, envelope: OutgoingMessageEnvelope) {
 trace("Enqueuing message: %s, %s." format (source, envelope))
 
@@ -110,10 +126,10 @@ class KafkaSystemProducer(systemName: String,
   throw new IllegalArgumentException("Source %s must be registered first 
before send." format source)
 }
 
-val exception = sourceData.exceptionThrown
+val exception = sourceData.exceptionInCallback.getAndSet(null)
 if (exception != null) {
   metrics.sendFailed.inc
-  throw exception
+  throw exception  // in case the caller catches all exceptions and will 
try again
 }
 
 // lazy initialization of the producer
@@ -134,8 +150,7 @@ class KafkaSystemProducer(systemName: String,
 // Java-based Kafka producer API requires an "Integer" type partitionKey 
and does not allow custom overriding of Partitioners
 // Any kind of custom partitioning has to be done on the client-side
 val partitions: java.util.List[PartitionInfo] = 

[jira] [Commented] (SAMZA-1027) InstanceAlreadyExistsException while starting up

2016-09-29 Thread Xinyu Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534154#comment-15534154
 ] 

Xinyu Liu commented on SAMZA-1027:
--

[~ncolomer]: could you please check whether you have producers for multiple 
Kafka systems? In SAMZA-981, we changed how the clientId for Kafka 
consumers/producers is generated: it is now based on just the job name and job 
instance, in order to support Kafka quotas 
(http://kafka.apache.org/documentation.html#design_quotas). This revealed a 
bug in Kafka: the client id is also used to register the metrics mbeans. So if 
you have Kafka producers for multiple systems, we create multiple 
producers sharing the same clientId, which causes the mbean naming collision. 
According to Kafka, the only side effect is that some of the Kafka metrics 
won't be reported through mbeans. Does that affect you? We are working with 
Kafka to get a fix for it in an upcoming Kafka version.
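
To make the collision concrete, here is a small sketch. It assumes a client id of the form `<prefix>-<job name>-<job id>`, as seen in the log in the issue description; the functions `client_id` and `colliding_ids` and the system names are hypothetical and only illustrate why ids built from the job identity alone repeat.

```python
from collections import Counter


def client_id(prefix, job_name, job_id):
    """Build an id from the job identity only (no timestamp or counter)."""
    return "%s-%s-%s" % (prefix, job_name, job_id)


def colliding_ids(ids):
    """Return the ids that occur more than once, sorted."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


if __name__ == "__main__":
    # One producer per (hypothetical) Kafka system, all in the same job.
    systems = ["kafka-a", "kafka-b"]
    ids = [client_id("samza_producer", "my_awesome_job", "123") for _ in systems]
    print(colliding_ids(ids))
```

Because the system name is not part of the id in this sketch, every producer in the job ends up with the same string, and any registry keyed by that string can hold only one of them.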

> InstanceAlreadyExistsException while starting up
> 
>
> Key: SAMZA-1027
> URL: https://issues.apache.org/jira/browse/SAMZA-1027
> Project: Samza
>  Issue Type: Bug
>  Components: kafka
>Affects Versions: 0.11
>Reporter: Nicolas Colomer
>
> After upgrading Samza, I started to see the following WARN log while starting a 
> Samza job:
> {code}
> 2016-09-29 10:30:09 AppInfoParser [WARN] task[] ssp[] offset[] Error 
> registering AppInfo mbean
> javax.management.InstanceAlreadyExistsException: 
> kafka.producer:type=app-info,id=samza_producer-my_awesome_job-123
> at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
> at 
> org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:58)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:328)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:163)
> at 
> org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89)
> at 
> org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89)
> at 
> org.apache.samza.system.kafka.KafkaSystemProducer.send(KafkaSystemProducer.scala:124)
> at 
> org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:113)
> at 
> org.apache.samza.coordinator.stream.AbstractCoordinatorStreamManager.send(AbstractCoordinatorStreamManager.java:72)
> at 
> org.apache.samza.container.LocalityManager.writeContainerToHostMapping(LocalityManager.java:134)
> at 
> org.apache.samza.container.SamzaContainer.startLocalityManager(SamzaContainer.scala:739)
> at 
> org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:651)
> at 
> org.apache.samza.container.SamzaContainer$.safeMain(SamzaContainer.scala:116)
> at 
> org.apache.samza.container.SamzaContainer$.main(SamzaContainer.scala:90)
> at 
> org.apache.samza.container.SamzaContainer.main(SamzaContainer.scala)
> {code}
> More precisely, in my situation, this log occurs twice during startup of the 
> job.
> I figured out that it follows the creation of 2 producers: {{samza_producer}} 
> and {{samza_checkpoint_manager}}, with client ids equal to 
> {{samza_producer-my_awesome_job-123}} and 
> {{samza_checkpoint_manager-my_awesome_job-123}} respectively.
> This issue seems to be directly related to SAMZA-981, which removed the 
> discriminating timestamp and unique counter value from the {{client.id}} 
> string.
> According to KAFKA-3992, this error occurs when multiple producers / 
> consumers are created with the same {{client.id}} setting.
> Looking at the source code, there is a 
> [lock|https://github.com/ncolomer/samza/blob/17e65d1cbdda1ad436f47c15fa4c86332e229a93/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L119-L127]
>  in {{KafkaSystemProducer}} that should prevent any race condition where this 
> could happen. But is there any case where this class (or the {{getProducer}} 
> lambda function) may be instantiated/reused multiple time in the same JVM 
> host?





[jira] [Commented] (SAMZA-995) Remove configs deprecated in 0.10.* release and update configs

2016-09-29 Thread Jagadish (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533729#comment-15533729
 ] 

Jagadish commented on SAMZA-995:


Thanks Xinyu for the patch, merged and submitted.

> Remove configs deprecated in 0.10.* release and update configs
> --
>
> Key: SAMZA-995
> URL: https://issues.apache.org/jira/browse/SAMZA-995
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Navina Ramesh
>Assignee: Xinyu Liu
> Fix For: 0.11.0
>
>
> "yarn.container.count" was deprecated in 0.10.0 and replaced with 
> "job.container.count". We should remove the warning message, remove the 
> deprecated config, and update the configuration tables as well. 
> Please also scan for more such variables (if any). 
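
For job owners, the practical effect of this change is a rename of one key. A hedged sketch of such a migration over a config held as a plain dictionary follows; the helper `migrate_container_count` is hypothetical and not part of Samza, and the choice to let an explicitly set new key win is an assumption.

```python
def migrate_container_count(config):
    """Return a copy of config with the deprecated key renamed."""
    old_key, new_key = "yarn.container.count", "job.container.count"
    migrated = dict(config)
    if old_key in migrated:
        value = migrated.pop(old_key)
        # If the new key is already set, keep it and drop the old value.
        migrated.setdefault(new_key, value)
    return migrated


if __name__ == "__main__":
    print(migrate_container_count({"yarn.container.count": "4"}))
```

Returning a copy keeps the original config untouched, which makes it easy to log both versions while migrating.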





[jira] [Resolved] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version

2016-09-29 Thread Jagadish (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadish resolved SAMZA-1020.
-
Resolution: Fixed

> Remove some deprecated interfaces from 0.10 version
> ---
>
> Key: SAMZA-1020
> URL: https://issues.apache.org/jira/browse/SAMZA-1020
> Project: Samza
>  Issue Type: Bug
>Reporter: Navina Ramesh
>Assignee: Xinyu Liu
>  Labels: newbie
> Fix For: 0.11.0
>
>
> Some interfaces/methods were deprecated as part of the 0.10.0 release. They 
> were marked as @Deprecated. However, we have yet to remove them.
> Some instances I noticed were in these classes:
> * readChangeLogPartitionMapping in KafkaCheckpointManager
> * isChangelogPartitionMapping, getChangelogPartitionMappingKey in 
> KafkaCheckpointLogKey ; Probably a few more in this class





[jira] [Updated] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version

2016-09-29 Thread Jagadish (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadish updated SAMZA-1020:

Assignee: Xinyu Liu

> Remove some deprecated interfaces from 0.10 version
> ---
>
> Key: SAMZA-1020
> URL: https://issues.apache.org/jira/browse/SAMZA-1020
> Project: Samza
>  Issue Type: Bug
>Reporter: Navina Ramesh
>Assignee: Xinyu Liu
>  Labels: newbie
> Fix For: 0.11.0
>
>
> Some interfaces/methods were deprecated as part of the 0.10.0 release. They 
> were marked as @Deprecated. However, we have yet to remove them.
> Some instances I noticed were in these classes:
> * readChangeLogPartitionMapping in KafkaCheckpointManager
> * isChangelogPartitionMapping, getChangelogPartitionMappingKey in 
> KafkaCheckpointLogKey ; Probably a few more in this class





[jira] [Commented] (SAMZA-1020) Remove some deprecated interfaces from 0.10 version

2016-09-29 Thread Jagadish (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533738#comment-15533738
 ] 

Jagadish commented on SAMZA-1020:
-

Thanks [~xinyu] for the patch. Merged, and submitted. Thanks!

> Remove some deprecated interfaces from 0.10 version
> ---
>
> Key: SAMZA-1020
> URL: https://issues.apache.org/jira/browse/SAMZA-1020
> Project: Samza
>  Issue Type: Bug
>Reporter: Navina Ramesh
>  Labels: newbie
> Fix For: 0.11.0
>
>
> Some interfaces/methods were deprecated as part of the 0.10.0 release. They 
> were marked as @Deprecated. However, we have yet to remove them.
> Some instances I noticed were in these classes:
> * readChangeLogPartitionMapping in KafkaCheckpointManager
> * isChangelogPartitionMapping, getChangelogPartitionMappingKey in 
> KafkaCheckpointLogKey ; Probably a few more in this class





[jira] [Commented] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference

2016-09-29 Thread Xinyu Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533732#comment-15533732
 ] 

Xinyu Liu commented on SAMZA-1028:
--

rb: https://reviews.apache.org/r/52403/

> Moving logline in callback thread and making exception thrown AtomicReference
> -
>
> Key: SAMZA-1028
> URL: https://issues.apache.org/jira/browse/SAMZA-1028
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11
>
>
> Currently the error is logged after the producer is closed, and the exception 
> is reset in later callbacks, which makes troubleshooting harder in 
> multithreaded cases. We should log the error before closing and keep an atomic 
> reference to the initial exception.





[jira] [Commented] (SAMZA-995) Remove configs deprecated in 0.10.* release and update configs

2016-09-29 Thread Jagadish (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533726#comment-15533726
 ] 

Jagadish commented on SAMZA-995:


RB: https://reviews.apache.org/r/52401/ .

> Remove configs deprecated in 0.10.* release and update configs
> --
>
> Key: SAMZA-995
> URL: https://issues.apache.org/jira/browse/SAMZA-995
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Navina Ramesh
>Assignee: Xinyu Liu
> Fix For: 0.11.0
>
>
> "yarn.container.count" was deprecated in 0.10.0 and we replaced it with 
> "job.container.count". We should remove the warning message, remove 
> deprecated config and update configurations tables as well. 
> Please also scan for more such variables (if any). 
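
The scan requested above could be sketched roughly as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, and the only deprecated/replacement pair it knows about is the one named in this issue.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that reports deprecated keys still present in a job config. */
final class DeprecatedConfigScan {
    // Only this pair comes from the issue text; any further entries would be assumptions.
    static final Map<String, String> REPLACEMENTS =
        Collections.singletonMap("yarn.container.count", "job.container.count");

    /** Returns deprecated key -> replacement key for every deprecated key found in config. */
    static Map<String, String> findDeprecated(Map<String, String> config) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String key : config.keySet()) {
            String replacement = REPLACEMENTS.get(key);
            if (replacement != null) {
                found.put(key, replacement);
            }
        }
        return found;
    }
}
```

A config that still sets yarn.container.count would be reported with job.container.count as the suggested replacement; keys such as yarn.container.memory.mb are left alone.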





[jira] [Created] (SAMZA-1028) Moving logline in callback thread and making exception thrown AtomicReference

2016-09-29 Thread Xinyu Liu (JIRA)
Xinyu Liu created SAMZA-1028:


 Summary: Moving logline in callback thread and making exception 
thrown AtomicReference
 Key: SAMZA-1028
 URL: https://issues.apache.org/jira/browse/SAMZA-1028
 Project: Samza
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Xinyu Liu
Assignee: Xinyu Liu
 Fix For: 0.11


Currently the error is logged after the producer is closed, and the exception 
is reset in later callbacks, which makes troubleshooting harder in cases of 
multithreading. We should log the error before closing and keep an atomic 
reference to the initial exception.
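
A minimal sketch of the idea, assuming a standalone holder class: it is not the actual change to the Kafka producer code, the names are hypothetical, and printing to stderr stands in for the real logger. The point is that compareAndSet(null, t) only succeeds for the first failure, so later callbacks cannot overwrite it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: remembers only the first failure reported by any callback thread. */
final class FirstErrorHolder {
    private final AtomicReference<Throwable> firstError = new AtomicReference<>();

    /**
     * Called from a send callback. Logs the failure immediately (before any close),
     * then stores it only if no earlier failure was stored.
     *
     * @return true if this call recorded the initial exception
     */
    boolean onFailure(Throwable t) {
        System.err.println("send failed: " + t); // stand-in for the real logger
        return firstError.compareAndSet(null, t);
    }

    /** Returns the first recorded failure, or null if none was reported. */
    Throwable get() {
        return firstError.get();
    }
}
```

With this shape, whichever thread fails first wins, and the caller that later rethrows always sees the original cause rather than a follow-on error.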





[2/2] samza git commit: SAMZA-995: Remove deprecated property yarn.container.count from config table.

2016-09-29 Thread jagadish
SAMZA-995: Remove deprecated property yarn.container.count from config table.


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/317b6ff1
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/317b6ff1
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/317b6ff1

Branch: refs/heads/master
Commit: 317b6ff1be8b497cd1468fddf3975b2832cf9d74
Parents: db31e89
Author: Xinyu Liu 
Authored: Thu Sep 29 11:50:42 2016 -0700
Committer: vjagadish1989 
Committed: Thu Sep 29 11:54:49 2016 -0700

--
 docs/contribute/coding-guide.md | 2 +-
 .../documentation/versioned/container/coordinator-stream.md | 2 +-
 .../documentation/versioned/jobs/configuration-table.html   | 9 -
 docs/learn/documentation/versioned/jobs/reprocessing.md | 2 +-
 docs/learn/documentation/versioned/jobs/yarn-jobs.md| 2 +-
 docs/learn/tutorials/versioned/remote-debugging-samza.md| 2 +-
 6 files changed, 5 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/contribute/coding-guide.md
--
diff --git a/docs/contribute/coding-guide.md b/docs/contribute/coding-guide.md
index 6cb4cee..659834f 100644
--- a/docs/contribute/coding-guide.md
+++ b/docs/contribute/coding-guide.md
@@ -91,7 +91,7 @@ We are following the style guide given here (though not 
perfectly). Below are so
 * All configuration names that define a class must end with .class (e.g. 
task.command.class).
 * All configuration names that define a factory class must end with 
.factory.class (e.g. systems.kafka.consumer.factory.class).
 * Configuration will always be defined as simple key/value pairs (e.g. a=b).
-* When configuration is related, it must be grouped using the same prefix 
(e.g. yarn.container.count=1, yarn.container.memory.bytes=1073741824).
+* When configuration is related, it must be grouped using the same prefix 
(e.g. job.container.count=1, yarn.container.memory.bytes=1073741824).
 * When configuration must be defined multiple times, the key should be 
parameterized (e.g. systems.kafka.consumer.factory=x, 
systems.kestrel.consumer.factory=y). *When such configuration must be referred 
to, its parameter should be used (e.g. foo.bar.system=kafka, 
foo.bar.system=kestrel).
 * All getter methods must be a camel case match with their configuration names 
(e.g. yarn.package.uri and getYarnPackageUri).
 * Reading configuration should only be done in factories and main methods. 
Don't pass Config objects around.

http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/container/coordinator-stream.md
--
diff --git a/docs/learn/documentation/versioned/container/coordinator-stream.md 
b/docs/learn/documentation/versioned/container/coordinator-stream.md
index b680593..6b14960 100644
--- a/docs/learn/documentation/versioned/container/coordinator-stream.md
+++ b/docs/learn/documentation/versioned/container/coordinator-stream.md
@@ -116,7 +116,7 @@ Samza provides a command line tool to write Job 
Configuration messages to the co
 samza-example/target/bin/run-coordinator-stream-writer.sh \
   --config-path=file:///path/to/job/config.properties \
   --type set-config \
-  --key yarn.container.count \
+  --key job.container.count \
   --value 8
 {% endhighlight %}
 

http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/jobs/configuration-table.html
--
diff --git a/docs/learn/documentation/versioned/jobs/configuration-table.html 
b/docs/learn/documentation/versioned/jobs/configuration-table.html
index 1d16f52..f60cd50 100644
--- a/docs/learn/documentation/versioned/jobs/configuration-table.html
+++ b/docs/learn/documentation/versioned/jobs/configuration-table.html
@@ -1524,15 +1524,6 @@
 
 
 
-yarn.container.count
-1
-
-This is deprecated in favor of
-job.container.count
-
-
-
-
 yarn.container.memory.mb
 1024
 

http://git-wip-us.apache.org/repos/asf/samza/blob/317b6ff1/docs/learn/documentation/versioned/jobs/reprocessing.md
--
diff --git a/docs/learn/documentation/versioned/jobs/reprocessing.md 
b/docs/learn/documentation/versioned/jobs/reprocessing.md
index 174b46f..97d8801 100644
--- a/docs/learn/documentation/versioned/jobs/reprocessing.md
+++ 

[1/2] samza git commit: SAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in the 0.10 release.

2016-09-29 Thread jagadish
Repository: samza
Updated Branches:
  refs/heads/master 985822028 -> 317b6ff1b


SAMZA-1020: Remove methods in Kafka checkpoint classes that were deprecated in 
the 0.10 release.


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/db31e899
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/db31e899
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/db31e899

Branch: refs/heads/master
Commit: db31e8992c7bcbe17a5ad85bbab6105e232a626a
Parents: 9858220
Author: vjagadish1989 
Authored: Thu Sep 29 11:48:22 2016 -0700
Committer: vjagadish1989 
Committed: Thu Sep 29 11:48:22 2016 -0700

--
 .../kafka/KafkaCheckpointLogKey.scala   | 23 
 .../kafka/KafkaCheckpointManager.scala  | 22 ---
 .../kafka/TeskKafkaCheckpointLogKey.scala   | 10 -
 3 files changed, 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/db31e899/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala
--
diff --git 
a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala
 
b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala
index ea8462d..9ed64c3 100644
--- 
a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala
+++ 
b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointLogKey.scala
@@ -57,13 +57,6 @@ class KafkaCheckpointLogKey private (val map: Map[String, 
String]) {
*/
   def isCheckpointKey = getKey.equals(CHECKPOINT_KEY_TYPE)
 
-  /**
-   * Is this key for a changelog partition mapping?
-   *
-   * @return true iff this key's entry is for a changelog partition mapping
-   */
-  @Deprecated
-  def isChangelogPartitionMapping = getKey.equals(CHANGELOG_PARTITION_KEY_TYPE)
 
   /**
* If this Key is for a checkpoint entry, return its associated TaskName.
@@ -98,17 +91,9 @@ object KafkaCheckpointLogKey {
   val CHECKPOINT_KEY_KEY = "type"
   val CHECKPOINT_KEY_TYPE = "checkpoint"
 
-  @Deprecated
-  val CHANGELOG_PARTITION_KEY_TYPE = "changelog-partition-mapping"
-
   val CHECKPOINT_TASKNAME_KEY = "taskName"
   val SYSTEMSTREAMPARTITION_GROUPER_FACTORY_KEY = 
"systemstreampartition-grouper-factory"
 
-  /**
-   * Partition mapping keys have no dynamic values, so we just need one 
instance.
-   */
-  @Deprecated
-  val CHANGELOG_PARTITION_MAPPING_KEY = new 
KafkaCheckpointLogKey(Map(CHECKPOINT_KEY_KEY -> CHANGELOG_PARTITION_KEY_TYPE))
 
   private val JSON_MAPPER = new ObjectMapper()
   val KEY_TYPEREFERENCE = new TypeReference[util.HashMap[String, String]]() {}
@@ -146,14 +131,6 @@ object KafkaCheckpointLogKey {
   }
 
   /**
-   * Build a key for a changelog partition mapping entry
-   *
-   * @return Key for changelog partition mapping entry
-   */
-  @Deprecated
-  def getChangelogPartitionMappingKey() = CHANGELOG_PARTITION_MAPPING_KEY
-
-  /**
* Deserialize a Kafka checkpoint log key
* @param bytes Serialized (via JSON) Kafka checkpoint log key
* @return Checkpoint log key

http://git-wip-us.apache.org/repos/asf/samza/blob/db31e899/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala
--
diff --git 
a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala
 
b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala
index ea10cae..8f18a92 100644
--- 
a/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala
+++ 
b/samza-kafka/src/main/scala/org/apache/samza/checkpoint/kafka/KafkaCheckpointManager.scala
@@ -290,28 +290,6 @@ class KafkaCheckpointManager(
 }
   }
 
-
-  /**
-   * Read through entire log, discarding checkpoints, finding latest 
changelogPartitionMapping
-   * To be used for Migration purpose only. In newer version, 
changelogPartitionMapping will be handled through coordinator stream
-   */
-  @Deprecated
-  def readChangeLogPartitionMapping(): util.Map[TaskName, java.lang.Integer] = 
{
-var changelogPartitionMapping: util.Map[TaskName, java.lang.Integer] = new 
util.HashMap[TaskName, java.lang.Integer]()
-
-def shouldHandleEntry(key: KafkaCheckpointLogKey) = 
key.isChangelogPartitionMapping
-
-def handleCheckpoint(payload: ByteBuffer, 
checkpointKey:KafkaCheckpointLogKey): Unit = {
-  changelogPartitionMapping = 
serde.changelogPartitionMappingFromBytes(Utils.readBytes(payload))
-
-  debug("Adding changelog partition mapping" + changelogPartitionMapping)
-}
-
-readLog(CHANGELOG_PARTITION_MAPPING_LOG4j, 

[jira] [Updated] (SAMZA-880) Moving to github/pull-request for code review and check-in

2016-09-29 Thread Navina Ramesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navina Ramesh updated SAMZA-880:

Fix Version/s: (was: 0.11.0)

> Moving to github/pull-request for code review and check-in
> --
>
> Key: SAMZA-880
> URL: https://issues.apache.org/jira/browse/SAMZA-880
> Project: Samza
>  Issue Type: Task
>Reporter: Yi Pan (Data Infrastructure)
>Assignee: Jagadish
> Attachments: MovingSamzatoPullRequests.pdf, SAMZA-880-1.patch, 
> SAMZA-880.0.patch
>
>
> As per mailing list 
> [discussion|http://mail-archives.apache.org/mod_mbox/samza-dev/201602.mbox/%3ccafvexu230gjja7z99lzn6bhnekypjrnof3kyc+dvgko_znq...@mail.gmail.com%3E],
>  we are going to change the code review and commit process to use github pull 
> request. This would require a few things:
> 1) incorporate kafka-merge-pr.py from [kafka 
> repo|https://github.com/apache/kafka/blob/trunk/kafka-merge-pr.py] into Samza, 
> including the merge steps for github pull requests to the Apache Samza git repo
> 2) update the documentation in contributor corner for the new developer
> 3) Document the committer instructions on using kafka-merge-pr.py.





[jira] [Updated] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages

2016-09-29 Thread Jake Maes (JIRA)

 [ 
https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Maes updated SAMZA-1023:
-
Fix Version/s: 0.11

> ContainerProcessManager spams the job coordinator log with "TaskManager 
> state" messages
> ---
>
> Key: SAMZA-1023
> URL: https://issues.apache.org/jira/browse/SAMZA-1023
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Jake Maes
>Assignee: Branislav Cogic
> Fix For: 0.11
>
> Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch
>
>
> There are 2 problems with the message in the Summary:
> 1. The message is printed every second in the job coordinator log. Do we 
> really want to keep periodically printing that message indefinitely? It would 
> certainly be a departure from our convention thus far.
> 2. Since it starts with "Too many FailedContainers" it gets interpreted as a 
> problem. Some reformatting, reordering, or rewording would help. 
> {noformat}
> 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: 

[jira] [Commented] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages

2016-09-29 Thread Jake Maes (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532863#comment-15532863
 ] 

Jake Maes commented on SAMZA-1023:
--

+1 
Merged and committed. 

Thanks [~banecogic]!

> ContainerProcessManager spams the job coordinator log with "TaskManager 
> state" messages
> ---
>
> Key: SAMZA-1023
> URL: https://issues.apache.org/jira/browse/SAMZA-1023
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Jake Maes
>Assignee: Branislav Cogic
> Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch
>
>
> There are 2 problems with the message in the Summary:
> 1. The message is printed every second in the job coordinator log. Do we 
> really want to keep periodically printing that message indefinitely? It would 
> certainly be a departure from our convention thus far.
> 2. Since it starts with "Too many FailedContainers" it gets interpreted as a 
> problem. Some reformatting, reordering, or rewording would help. 
> {noformat}
> 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured 

samza git commit: SAMZA-1023 ContainerProcessManager spams the job coordinator log with 'TaskManager state' messages

2016-09-29 Thread jmakes
Repository: samza
Updated Branches:
  refs/heads/master 623a0a9b2 -> 985822028


SAMZA-1023 ContainerProcessManager spams the job coordinator log with 
'TaskManager state' messages


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/98582202
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/98582202
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/98582202

Branch: refs/heads/master
Commit: 985822028a71547189ce84faae3d6af1edd4c909
Parents: 623a0a9
Author: Branislav Cogic 
Authored: Thu Sep 29 06:56:24 2016 -0700
Committer: Jacob Maes 
Committed: Thu Sep 29 06:56:24 2016 -0700

--
 .../org/apache/samza/clustermanager/ContainerProcessManager.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/samza/blob/98582202/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
--
diff --git 
a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
 
b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
index 1fed2fb..b4309d9 100644
--- 
a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
+++ 
b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
@@ -150,8 +150,8 @@ public class ContainerProcessManager implements 
ClusterResourceManager.Callback
   }
 
   public boolean shouldShutdown() {
-log.info(" TaskManager state: Too many FailedContainers: {} No. Completed 
containers: {} Num Configured containers: {}" +
-  " AllocatorThread liveness: {} ", new Object[]{tooManyFailedContainers, 
state.completedContainers.get(), state.containerCount, 
allocatorThread.isAlive()});
+log.debug(" TaskManager state: Completed containers: {}, Configured 
containers: {}, Is there too many FailedContainers: {}, Is AllocatorThread 
alive: {} "
+  , new Object[]{state.completedContainers.get(), state.containerCount, 
tooManyFailedContainers ? "yes" : "no", allocatorThread.isAlive() ? "yes" : 
"no"});
 
 if (exceptionOccurred != null) {
   log.error("Exception in ContainerProcessManager", exceptionOccurred);



[jira] [Created] (SAMZA-1027) InstanceAlreadyExistsException while starting up

2016-09-29 Thread Nicolas Colomer (JIRA)
Nicolas Colomer created SAMZA-1027:
--

 Summary: InstanceAlreadyExistsException while starting up
 Key: SAMZA-1027
 URL: https://issues.apache.org/jira/browse/SAMZA-1027
 Project: Samza
  Issue Type: Bug
  Components: kafka
Affects Versions: 0.11
Reporter: Nicolas Colomer


After upgrading Samza, I started to see the following WARN log while starting a 
Samza job:

{code}
2016-09-29 10:30:09 AppInfoParser [WARN] task[] ssp[] offset[] Error registering AppInfo mbean
javax.management.InstanceAlreadyExistsException: kafka.producer:type=app-info,id=samza_producer-my_awesome_job-123
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
    at org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:58)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:328)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:163)
    at org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89)
    at org.apache.samza.system.kafka.KafkaSystemFactory$$anonfun$3.apply(KafkaSystemFactory.scala:89)
    at org.apache.samza.system.kafka.KafkaSystemProducer.send(KafkaSystemProducer.scala:124)
    at org.apache.samza.coordinator.stream.CoordinatorStreamSystemProducer.send(CoordinatorStreamSystemProducer.java:113)
    at org.apache.samza.coordinator.stream.AbstractCoordinatorStreamManager.send(AbstractCoordinatorStreamManager.java:72)
    at org.apache.samza.container.LocalityManager.writeContainerToHostMapping(LocalityManager.java:134)
    at org.apache.samza.container.SamzaContainer.startLocalityManager(SamzaContainer.scala:739)
    at org.apache.samza.container.SamzaContainer.run(SamzaContainer.scala:651)
    at org.apache.samza.container.SamzaContainer$.safeMain(SamzaContainer.scala:116)
    at org.apache.samza.container.SamzaContainer$.main(SamzaContainer.scala:90)
    at org.apache.samza.container.SamzaContainer.main(SamzaContainer.scala)
{code}

More precisely, in my situation, this log occurs twice during startup of the 
job.

I figured out that it follows the creation of 2 producers, {{samza_producer}} 
and {{samza_checkpoint_manager}}, with client ids respectively equal to 
{{samza_producer-my_awesome_job-123}} and 
{{samza_checkpoint_manager-my_awesome_job-123}}.

This issue seems to be directly related to SAMZA-981, which removed the 
discriminating timestamp and unique counter value from the {{client.id}} string.

According to KAFKA-3992, this error occurs when multiple producers / consumers 
are created with the same {{client.id}} setting.

Looking at the source code, there is a 
[lock|https://github.com/ncolomer/samza/blob/17e65d1cbdda1ad436f47c15fa4c86332e229a93/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L119-L127]
 in {{KafkaSystemProducer}} that should prevent any race condition where this 
could happen. But is there any case where this class (or the {{getProducer}} 
lambda function) may be instantiated/reused multiple times in the same JVM?
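
For reference, the JMX behaviour behind the warning can be reproduced without Kafka at all. The sketch below is an assumption-laden illustration, not Kafka or Samza code: the AppInfo MBean and the class around it are hypothetical stand-ins, and only the ObjectName pattern is copied from the log above. It registers two MBeans on a fresh MBeanServer and reports whether the second registration is rejected.

```java
import javax.management.InstanceAlreadyExistsException;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

final class DuplicateClientIdDemo {
    /** Minimal standard MBean; JMX requires a public interface named <Class>MBean. */
    public interface AppInfoMBean {
        String getVersion();
    }

    /** Hypothetical stand-in for the app-info bean a producer registers. */
    public static final class AppInfo implements AppInfoMBean {
        @Override
        public String getVersion() {
            return "demo";
        }
    }

    /** Returns true if registering the second id is rejected as a duplicate of the first. */
    static boolean collides(String firstClientId, String secondClientId) {
        MBeanServer server = MBeanServerFactory.newMBeanServer(); // fresh server, no shared state
        try {
            server.registerMBean(new AppInfo(), nameFor(firstClientId));
            try {
                server.registerMBean(new AppInfo(), nameFor(secondClientId));
                return false;
            } catch (InstanceAlreadyExistsException e) {
                return true; // same ObjectName registered twice
            }
        } catch (JMException e) {
            throw new IllegalStateException("unexpected JMX failure", e);
        }
    }

    private static ObjectName nameFor(String clientId) throws JMException {
        // Name pattern taken from the WARN log above.
        return new ObjectName("kafka.producer:type=app-info,id=" + clientId);
    }
}
```

Calling collides with the same id twice returns true, while the two ids listed above (samza_producer-... and samza_checkpoint_manager-...) do not collide with each other, which suggests each of the two warnings comes from a second producer being created under one of those ids.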





[jira] [Commented] (SAMZA-981) Set consistent Kafka clientId for a job instance

2016-09-29 Thread Nicolas Colomer (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532352#comment-15532352
 ] 

Nicolas Colomer commented on SAMZA-981:
---

It seems you omitted to remove the {{counter}} val of type {{AtomicLong}}, which 
is no longer used in {{KafkaUtil}}.

> Set consistent Kafka clientId for a job instance
> 
>
> Key: SAMZA-981
> URL: https://issues.apache.org/jira/browse/SAMZA-981
> Project: Samza
>  Issue Type: Improvement
>Affects Versions: 0.11
>Reporter: Xinyu Liu
>Assignee: Xinyu Liu
> Fix For: 0.11
>
> Attachments: SAMZA-981.0.patch
>
>
> Remove the currentTimeMillis and counter in the ClientId creation so the 
> clientId will be the same for each job instance. With this change we can turn 
> on kafka throttling on the job instance level.
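
To illustrate the difference described above, here is a rough sketch. It is not the KafkaUtil code: the class and method names are hypothetical, the prefix-jobName-jobId layout is inferred from the client id seen in SAMZA-1027 (samza_producer-my_awesome_job-123), and the position of the timestamp and counter in the legacy form is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of stable vs. per-instance-unique client ids. */
final class ClientIds {
    private static final AtomicLong COUNTER = new AtomicLong();

    /** Same inputs always give the same id, so a quota can be keyed on it. */
    static String stable(String prefix, String jobName, String jobId) {
        return prefix + "-" + jobName + "-" + jobId;
    }

    /** Assumed legacy shape: a timestamp and a counter make every id unique. */
    static String legacyUnique(String prefix, String jobName, String jobId) {
        return stable(prefix, jobName, jobId)
            + "-" + System.currentTimeMillis()
            + "-" + COUNTER.getAndIncrement();
    }
}
```

The stable form is what makes job-level throttling possible, but it also means two clients created with the same prefix in one JVM share an id.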





[jira] [Commented] (SAMZA-1023) ContainerProcessManager spams the job coordinator log with "TaskManager state" messages

2016-09-29 Thread Nicolas Colomer (JIRA)

[ 
https://issues.apache.org/jira/browse/SAMZA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532191#comment-15532191
 ] 

Nicolas Colomer commented on SAMZA-1023:


Upvote for this one :)

> ContainerProcessManager spams the job coordinator log with "TaskManager 
> state" messages
> ---
>
> Key: SAMZA-1023
> URL: https://issues.apache.org/jira/browse/SAMZA-1023
> Project: Samza
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Jake Maes
>Assignee: Branislav Cogic
> Attachments: SAMZA-1023_0.patch, SAMZA-1023_1.patch
>
>
> There are 2 problems with the message in the Summary:
> 1. The message is printed every second in the job coordinator log. Do we 
> really want to keep periodically printing that message indefinitely? It would 
> certainly be a departure from our convention thus far.
> 2. Since it starts with "Too many FailedContainers" it gets interpreted as a 
> problem. Some reformatting, reordering, or rewording would help. 
> {noformat}
> 2016-09-21 20:10:40.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:41.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:42.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:43.837 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:44.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:45.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:46.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:47.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:48.838 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:49.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:50.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:51.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:52.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:53.839 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:54.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:55.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:56.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5 AllocatorThread liveness: true 
> 2016-09-21 20:10:57.840 [main] ContainerProcessManager [INFO]  TaskManager 
> state: Too many FailedContainers: false No. Completed containers: 0 Num 
> Configured containers: 5