[GitHub] samza pull request #98: SAMZA-1171: Rewrite config in ApplicationRunnerMain ...
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/98 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] samza pull request #98: SAMZA-1171: Rewrite config in ApplicationRunnerMain ...
GitHub user xinyuiscool opened a pull request:

    https://github.com/apache/samza/pull/98

    SAMZA-1171: Rewrite config in ApplicationRunnerMain when creating ApplicationRunner

    The config needs to be rewritten before passing down to the ApplicationRunner. This is a bug that was introduced during some refactoring/cleanup of the config in the ApplicationRunner interface.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xinyuiscool/samza SAMZA-1171

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/98.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #98

commit 8c29ba454a1a5cbb12f9535ea7271438cc5540ca
Author: Xinyu Liu
Date:   2017-03-27T23:21:01Z

    SAMZA-1171: Rewrite config in ApplicationRunnerMain when creating the application runner
[GitHub] samza pull request #97: SAMZA-1143 Include fs.<scheme>.impl.* subkeys to Yar...
GitHub user fredji97 opened a pull request:

    https://github.com/apache/samza/pull/97

    SAMZA-1143 Include fs.<scheme>.impl.* subkeys to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager

    SAMZA-1143 Include fs.<scheme>.impl.* subkeys, in addition to fs.<scheme>.impl, in the YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager.

    When there are additional sub-configurations under fs.myScheme.impl, such as fs.myScheme.impl.client, we need to keep the configuration set complete in YarnJobFactory and YarnClusterResourceManager. When the context is set for localizing the resource in ClientHelper and YarnContainerRunner, it may rely on this configuration to get the FileStatus information, which may or may not depend on fs.<scheme>.impl and all possible fs.<scheme>.impl.* sub-configurations.

    This is an enhancement to PR #90.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fredji97/samza fsImplSubkeys

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/97.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #97

commit 988d43573656c7d795b251c824c7fad3c6f5d9f3
Author: Fred Ji
Date:   2017-03-27T22:19:26Z

    SAMZA-1143 Copy fs.<scheme>.impl.* subkeys, in addition to fs.<scheme>.impl, to YarnConfiguration used in YarnJobFactory and YarnClusterResourceManager. When there are additional subconfigurations under fs.<scheme>.impl, such as fs.<scheme>.impl.client, we need to keep the set of configuration completed.
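The key selection described in this pull request can be illustrated with a small Python sketch. This is not the Samza implementation (which lives in YarnJobFactory and YarnClusterResourceManager); the function name `select_fs_impl_keys`, the regular expression, and the sample class names are assumptions made for illustration only.

```python
import re

# Matches fs.<scheme>.impl itself and any fs.<scheme>.impl.* subkey.
FS_IMPL_KEY = re.compile(r"^fs\.[^.]+\.impl(\..+)?$")


def select_fs_impl_keys(job_config):
    """Return only the file-system implementation keys from a job config."""
    return {key: value for key, value in job_config.items() if FS_IMPL_KEY.match(key)}


job_config = {
    "fs.myScheme.impl": "com.example.MySchemeFileSystem",     # hypothetical class
    "fs.myScheme.impl.client": "com.example.MySchemeClient",  # hypothetical subkey value
    "fs.defaultFS": "hdfs://namenode:8020",                   # not an impl key, so skipped
    "job.name": "my-job",
}

# These are the entries that would be copied into the YARN-side configuration.
yarn_overrides = select_fs_impl_keys(job_config)
```

Matching on the whole `fs.<scheme>.impl` prefix rather than on the exact key is what keeps subkeys such as `fs.myScheme.impl.client` together with their parent key.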
Re: Steps to Upgrading Samza (0.9 to 0.12)
Hi colleagues,

If I understand the Samza source code correctly, without the migration code we would not lose offsets, but we would lose the TaskName-to-ChangelogPartition mapping. State restore for TaskStorage starts from the beginning of the ChangelogSystemStreamPartition if we don't use data locality (with data locality, the offsets to start from are stored locally in a file): https://github.com/apache/samza/blob/0.12.0/samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala#L161-L161 .

Hence, if we lose the TaskName-to-ChangelogPartition mapping (for example, by migrating from Samza 0.9 to Samza 0.11 or 0.12 without an intermediate migration to Samza 0.10) and we still have data in our ChangelogSystemStream, Samza will recreate the TaskName-to-ChangelogPartition mapping and restore state from the newly selected ChangelogSystemStreamPartition.

Samza 0.12 sorts the collection by TaskName when re-creating this mapping ( https://github.com/apache/samza/blob/0.12.0/samza-core/src/main/scala/org/apache/samza/coordinator/JobModelManager.scala#L259-L259 ), but Samza 0.9 does not ( https://github.com/apache/samza/blob/0.9.1/samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala#L142 ). Hence, when migrating from Samza 0.9, we can end up with the wrong state restored for TaskStorages, because a Map gives no guarantee about iteration order.

Please correct me if I'm wrong.

Best regards,
Maxim Logvinenko

On 27 March 2017 at 20:58:24, Navina Ramesh (Apache) (nav...@apache.org) wrote:

@Jake: Yes. We removed the migration code (for 0.9 to 0.10) in the 0.11 release, I believe.

@XiaoChuan: As per Jagadish's recommendation, if you have changelog-backed stores, you should upgrade from 0.9.1 to 0.10.0 before upgrading to Samza 0.12.0. I checked with LinkedIn's internal release notes. The most significant change listed is adding a new configuration *job.coordinator.system*. This system can be the same as your currently configured checkpoint system (task.checkpoint.system). I am assuming you are using KafkaCheckpointManagerFactory. If you are using other custom checkpoint managers, the migration may be more involved. Please let us know and we can try to help you out.

Feel free to email us if you have more questions.

Cheers!
Navina

On Mon, Mar 27, 2017 at 10:07 AM, Jagadish Venkatraman < jagadish1...@gmail.com> wrote:

> Good observation Jake!
>
> The code for migration was removed in Samza 0.11. The migration would read
> change-log offsets from the checkpoint topic and write them to the
> coordinator stream.
>
> If you're using change-logged stores, I'd recommend upgrading from 0.9.1 to
> 0.10.0 first.
> Otherwise, you will lose offsets for change-logged stores.
>
> I suspect you should be okay for 0.10.0 to 0.12 upgrade.
>
> On Mon, Mar 27, 2017 at 9:30 AM, Jacob Maes wrote:
>
> > As I recall, Samza 0.10 introduced the coordinator stream and there was
> > code to do an automatic migration to use that feature. @navina, @yi, do you
> > know if that migration code is still in Samza 0.12?
> >
> > If not, then it's probably better to update from 0.9.1 to 0.10.0 and then
> > to 0.12.0. I don't think there were any changes requiring migration between
> > 0.10 and 0.12, so upgrading directly from 0.10 to 0.12 is probably less of
> > an issue.
> >
> > On Fri, Mar 24, 2017 at 11:05 PM, Jagadish Venkatraman <
> > jagadish1...@gmail.com> wrote:
> >
> > > Hi Xiaochuan,
> > >
> > > >> Do I need to upgrade Kafka and/or YARN?
> > >
> > > *Yarn version:*
> > >
> > >    - Samza 0.12 supports Yarn 2.6.1 and 2.7.1.
> > >    - If you already have 2.6.0 installed (as you have said), I believe you
> > >      will be fine. (but I'm not sure)
> > >
> > > *Kafka version:*
> > >
> > >    - Samza 0.12 upgraded the version of Kafka to 0.10.
> > >    - If your Kafka brokers are on an older version of Kafka, you should
> > >      upgrade them to use at least 0.10. Kafka clients are usually
> > >      incompatible with older versions of brokers.
> > >
> > > *Java version:*
> > >
> > >    - Samza 0.12 binaries are compiled using Java 8. Hence, they cannot be
> > >      run on older versions of the Java run-time.
> > >
> > > >> I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > > >> information would be relevant in this case so please ask away.
> > >
> > > I'd first start by upgrading the Kafka brokers (assuming you're on Java 8+
> > > already).
> > > Let us know how the migration goes!
> > >
> > > Thanks,
> > > Jagadish
> > >
> > > On Fri, Mar 24, 2017 at 8:23 PM, XiaoChuan Yu wrote:
> > >
> > > > Hi,
> > > >
> > > > What are the general steps for upgrading Samza from 0.9 to 0.12?
> > > > Do I need to upgrade Kafka and/or YARN?
> > > >
> > > > I don't know how Samza was setup initially but we currently have the
> > > > following setup:
> > > >
> > > > Samza version: 0.9.1
> > > > YARN version: Hadoop 2.6.0-cdh5.4.8
> > > > Kafka version: 0.9.0.1
> > > >
> > > > I think installation of Kafka and YARN w
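The ordering concern raised in this message can be illustrated with a toy Python model. It is only a sketch of the general idea (assigning changelog partitions by position in an iteration), not Samza's actual code; the function `assign_changelog_partitions` and the task names are hypothetical.

```python
def assign_changelog_partitions(task_names, sort_first):
    """Give each task the next free changelog partition id, in iteration order."""
    ordered = sorted(task_names) if sort_first else list(task_names)
    return {name: partition for partition, name in enumerate(ordered)}


# The same three tasks, seen in two different iteration orders
# (as could happen with a map that gives no ordering guarantee).
run_a = ["Partition 2", "Partition 0", "Partition 1"]
run_b = ["Partition 1", "Partition 2", "Partition 0"]

# Without sorting, the mapping depends on whatever order the tasks arrive in.
unsorted_a = assign_changelog_partitions(run_a, sort_first=False)
unsorted_b = assign_changelog_partitions(run_b, sort_first=False)

# With sorting, both runs produce the same mapping.
sorted_a = assign_changelog_partitions(run_a, sort_first=True)
sorted_b = assign_changelog_partitions(run_b, sort_first=True)

# Tasks that would restore state from a different partition in the second run.
mismatched = sorted(name for name in unsorted_a if unsorted_a[name] != unsorted_b[name])
```

In this model a recreated mapping matches the earlier one only when both were built from the same order, which is why sorting by task name makes the assignment reproducible.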
[GitHub] samza pull request #92: SAMZA-1094: Remove MessageEnvelope from public opera...
Github user prateekm closed the pull request at: https://github.com/apache/samza/pull/92
[GitHub] samza pull request #92: SAMZA-1094: Remove MessageEnvelope from public opera...
GitHub user prateekm reopened a pull request:

    https://github.com/apache/samza/pull/92

    SAMZA-1094: Remove MessageEnvelope from public operator APIs. SAMZA-1101: Delay the creation of SinkFunction for output streams. SAMZA-1159: Move StreamSpec from a public API to an internal class.

    Removed the MessageEnvelope and OutputStream interfaces from public operator APIs.
    Moved the creation of SinkFunction for output streams to SinkOperatorSpec.
    Moved StreamSpec from a public API to an internal class.

    Additionally,
    1. Removed references to StreamGraph in OperatorSpecs. It was being used to call getNextOpId(). MessageStreamsImpl now gets the ID and gives it to OperatorSpecs itself.
    2. Updated and cleaned up the StreamGraphBuilder examples.
    3. Renamed SinkOperatorSpec to OutputOperatorSpec since it's used by sink, sendTo and partitionBy.

    @nickpan47 and @xinyuiscool, please take a look.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/prateekm/samza message-envelope-removal

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/92.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #92

commit c3ae02dcf388db12d595674732643b1828052497
Author: Prateek Maheshwari
Date:   2017-03-17T21:32:52Z

    SAMZA-1094, SAMZA-1101: Remove MessageEnvelope from public operator APIs. Delay the creation of SinkFunction for output streams.

commit 865b9db99fd1c9ac34e84cdef7130f0dfd5261d5
Author: Prateek Maheshwari
Date:   2017-03-20T21:04:41Z

    Removed unused method. Minor additional cleanup.

commit c98a0806fc95cdcc004900dde647517fd8b4c8ea
Author: Prateek Maheshwari
Date:   2017-03-22T00:03:39Z

    SAMZA-1159: Move StreamSpec from a public API to an internal class.
Re: Steps to Upgrading Samza (0.9 to 0.12)
@Jake: Yes. We removed the migration code (for 0.9 to 0.10) in the 0.11 release, I believe.

@XiaoChuan: As per Jagadish's recommendation, if you have changelog-backed stores, you should upgrade from 0.9.1 to 0.10.0 before upgrading to Samza 0.12.0. I checked with LinkedIn's internal release notes. The most significant change listed is adding a new configuration *job.coordinator.system*. This system can be the same as your currently configured checkpoint system (task.checkpoint.system). I am assuming you are using KafkaCheckpointManagerFactory. If you are using other custom checkpoint managers, the migration may be more involved. Please let us know and we can try to help you out.

Feel free to email us if you have more questions.

Cheers!
Navina

On Mon, Mar 27, 2017 at 10:07 AM, Jagadish Venkatraman < jagadish1...@gmail.com> wrote:

> Good observation Jake!
>
> The code for migration was removed in Samza 0.11. The migration would read
> change-log offsets from the checkpoint topic and write them to the
> coordinator stream.
>
> If you're using change-logged stores, I'd recommend upgrading from 0.9.1 to
> 0.10.0 first.
> Otherwise, you will lose offsets for change-logged stores.
>
> I suspect you should be okay for 0.10.0 to 0.12 upgrade.
>
> On Mon, Mar 27, 2017 at 9:30 AM, Jacob Maes wrote:
>
> > As I recall, Samza 0.10 introduced the coordinator stream and there was
> > code to do an automatic migration to use that feature. @navina, @yi, do you
> > know if that migration code is still in Samza 0.12?
> >
> > If not, then it's probably better to update from 0.9.1 to 0.10.0 and then
> > to 0.12.0. I don't think there were any changes requiring migration between
> > 0.10 and 0.12, so upgrading directly from 0.10 to 0.12 is probably less of
> > an issue.
> >
> > On Fri, Mar 24, 2017 at 11:05 PM, Jagadish Venkatraman <
> > jagadish1...@gmail.com> wrote:
> >
> > > Hi Xiaochuan,
> > >
> > > >> Do I need to upgrade Kafka and/or YARN?
> > >
> > > *Yarn version:*
> > >
> > >    - Samza 0.12 supports Yarn 2.6.1 and 2.7.1.
> > >    - If you already have 2.6.0 installed (as you have said), I believe you
> > >      will be fine. (but I'm not sure)
> > >
> > > *Kafka version:*
> > >
> > >    - Samza 0.12 upgraded the version of Kafka to 0.10.
> > >    - If your Kafka brokers are on an older version of Kafka, you should
> > >      upgrade them to use at least 0.10. Kafka clients are usually
> > >      incompatible with older versions of brokers.
> > >
> > > *Java version:*
> > >
> > >    - Samza 0.12 binaries are compiled using Java 8. Hence, they cannot be
> > >      run on older versions of the Java run-time.
> > >
> > > >> I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > > >> information would be relevant in this case so please ask away.
> > >
> > > I'd first start by upgrading the Kafka brokers (assuming you're on Java 8+
> > > already).
> > > Let us know how the migration goes!
> > >
> > > Thanks,
> > > Jagadish
> > >
> > > On Fri, Mar 24, 2017 at 8:23 PM, XiaoChuan Yu wrote:
> > >
> > > > Hi,
> > > >
> > > > What are the general steps for upgrading Samza from 0.9 to 0.12?
> > > > Do I need to upgrade Kafka and/or YARN?
> > > >
> > > > I don't know how Samza was setup initially but we currently have the
> > > > following setup:
> > > >
> > > > Samza version: 0.9.1
> > > > YARN version: Hadoop 2.6.0-cdh5.4.8
> > > > Kafka version: 0.9.0.1
> > > >
> > > > I think installation of Kafka and YARN were managed through Puppet.
> > > > I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > > > information would be relevant in this case so please ask away.
> > > >
> > > > Thanks,
> > > > Xiaochuan Yu
> > >
> > > --
> > > Jagadish V,
> > > Graduate Student,
> > > Department of Computer Science,
> > > Stanford University
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
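The configuration change mentioned in this message (adding job.coordinator.system, possibly reusing the existing checkpoint system) can be sketched in Python as a dictionary transformation. This is an illustration under stated assumptions, not a Samza tool: the helper `add_coordinator_system`, the job name, and the system name `kafka` are hypothetical, and only the two config keys and the checkpoint factory class come from the thread and from Samza itself.

```python
def add_coordinator_system(config):
    """Return a copy of the job config with job.coordinator.system set.

    Assumption for this sketch: when no coordinator system is configured,
    reuse the system already named by task.checkpoint.system.
    """
    updated = dict(config)
    if "job.coordinator.system" not in updated:
        checkpoint_system = updated.get("task.checkpoint.system")
        if checkpoint_system is None:
            raise ValueError(
                "task.checkpoint.system is not set; choose job.coordinator.system explicitly"
            )
        updated["job.coordinator.system"] = checkpoint_system
    return updated


# A hypothetical pre-0.10 job config that uses Kafka-based checkpoints.
old_config = {
    "job.name": "my-job",
    "task.checkpoint.factory": "org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory",
    "task.checkpoint.system": "kafka",
}

new_config = add_coordinator_system(old_config)
```

Jobs with a custom checkpoint manager would likely need more than this, as the message itself cautions.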
[GitHub] samza pull request #90: SAMZA-1143 Universal config support for localized re...
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/90
Re: Steps to Upgrading Samza (0.9 to 0.12)
Good observation Jake!

The code for migration was removed in Samza 0.11. The migration would read change-log offsets from the checkpoint topic and write them to the coordinator stream.

If you're using change-logged stores, I'd recommend upgrading from 0.9.1 to 0.10.0 first. Otherwise, you will lose offsets for change-logged stores.

I suspect you should be okay for 0.10.0 to 0.12 upgrade.

On Mon, Mar 27, 2017 at 9:30 AM, Jacob Maes wrote:

> As I recall, Samza 0.10 introduced the coordinator stream and there was
> code to do an automatic migration to use that feature. @navina, @yi, do you
> know if that migration code is still in Samza 0.12?
>
> If not, then it's probably better to update from 0.9.1 to 0.10.0 and then
> to 0.12.0. I don't think there were any changes requiring migration between
> 0.10 and 0.12, so upgrading directly from 0.10 to 0.12 is probably less of
> an issue.
>
> On Fri, Mar 24, 2017 at 11:05 PM, Jagadish Venkatraman <
> jagadish1...@gmail.com> wrote:
>
> > Hi Xiaochuan,
> >
> > >> Do I need to upgrade Kafka and/or YARN?
> >
> > *Yarn version:*
> >
> >    - Samza 0.12 supports Yarn 2.6.1 and 2.7.1.
> >    - If you already have 2.6.0 installed (as you have said), I believe you
> >      will be fine. (but I'm not sure)
> >
> > *Kafka version:*
> >
> >    - Samza 0.12 upgraded the version of Kafka to 0.10.
> >    - If your Kafka brokers are on an older version of Kafka, you should
> >      upgrade them to use at least 0.10. Kafka clients are usually
> >      incompatible with older versions of brokers.
> >
> > *Java version:*
> >
> >    - Samza 0.12 binaries are compiled using Java 8. Hence, they cannot be
> >      run on older versions of the Java run-time.
> >
> > >> I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > >> information would be relevant in this case so please ask away.
> >
> > I'd first start by upgrading the Kafka brokers (assuming you're on Java 8+
> > already).
> > Let us know how the migration goes!
> >
> > Thanks,
> > Jagadish
> >
> > On Fri, Mar 24, 2017 at 8:23 PM, XiaoChuan Yu wrote:
> >
> > > Hi,
> > >
> > > What are the general steps for upgrading Samza from 0.9 to 0.12?
> > > Do I need to upgrade Kafka and/or YARN?
> > >
> > > I don't know how Samza was setup initially but we currently have the
> > > following setup:
> > >
> > > Samza version: 0.9.1
> > > YARN version: Hadoop 2.6.0-cdh5.4.8
> > > Kafka version: 0.9.0.1
> > >
> > > I think installation of Kafka and YARN were managed through Puppet.
> > > I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > > information would be relevant in this case so please ask away.
> > >
> > > Thanks,
> > > Xiaochuan Yu
> >
> > --
> > Jagadish V,
> > Graduate Student,
> > Department of Computer Science,
> > Stanford University

--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
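The upgrade order recommended in this message can be written down as a small Python sketch. It encodes only what the thread states (jobs with change-logged stores coming from before 0.10 should go through 0.10.0 first); the function `plan_upgrade` and its parameters are hypothetical and are not part of any Samza tooling.

```python
def plan_upgrade(current, target, has_changelog_stores):
    """Return the ordered list of Samza versions to deploy.

    Sketch of the thread's advice only: the 0.9 -> 0.10 migration code is no
    longer present after 0.10, so change-logged jobs hop through 0.10.0.
    """
    def minor(version):
        # "0.9.1" -> (0, 9); only major and minor matter for this rule.
        return tuple(int(part) for part in version.split(".")[:2])

    steps = []
    if has_changelog_stores and minor(current) < (0, 10) and minor(target) > (0, 10):
        steps.append("0.10.0")
    steps.append(target)
    return steps


path = plan_upgrade("0.9.1", "0.12.0", has_changelog_stores=True)
direct = plan_upgrade("0.10.0", "0.12.0", has_changelog_stores=True)
```

Kafka broker and Java runtime prerequisites discussed in the thread are outside the scope of this sketch.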
Re: Steps to Upgrading Samza (0.9 to 0.12)
As I recall, Samza 0.10 introduced the coordinator stream and there was code to do an automatic migration to use that feature. @navina, @yi, do you know if that migration code is still in Samza 0.12?

If not, then it's probably better to update from 0.9.1 to 0.10.0 and then to 0.12.0. I don't think there were any changes requiring migration between 0.10 and 0.12, so upgrading directly from 0.10 to 0.12 is probably less of an issue.

On Fri, Mar 24, 2017 at 11:05 PM, Jagadish Venkatraman < jagadish1...@gmail.com> wrote:

> Hi Xiaochuan,
>
> >> Do I need to upgrade Kafka and/or YARN?
>
> *Yarn version:*
>
>    - Samza 0.12 supports Yarn 2.6.1 and 2.7.1.
>    - If you already have 2.6.0 installed (as you have said), I believe you
>      will be fine. (but I'm not sure)
>
> *Kafka version:*
>
>    - Samza 0.12 upgraded the version of Kafka to 0.10.
>    - If your Kafka brokers are on an older version of Kafka, you should
>      upgrade them to use at least 0.10. Kafka clients are usually
>      incompatible with older versions of brokers.
>
> *Java version:*
>
>    - Samza 0.12 binaries are compiled using Java 8. Hence, they cannot be
>      run on older versions of the Java run-time.
>
> >> I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> >> information would be relevant in this case so please ask away.
>
> I'd first start by upgrading the Kafka brokers (assuming you're on Java 8+
> already).
> Let us know how the migration goes!
>
> Thanks,
> Jagadish
>
> On Fri, Mar 24, 2017 at 8:23 PM, XiaoChuan Yu wrote:
>
> > Hi,
> >
> > What are the general steps for upgrading Samza from 0.9 to 0.12?
> > Do I need to upgrade Kafka and/or YARN?
> >
> > I don't know how Samza was setup initially but we currently have the
> > following setup:
> >
> > Samza version: 0.9.1
> > YARN version: Hadoop 2.6.0-cdh5.4.8
> > Kafka version: 0.9.0.1
> >
> > I think installation of Kafka and YARN were managed through Puppet.
> > I'm extremely new to Samza in terms of operations aspect. I'm not sure what
> > information would be relevant in this case so please ask away.
> >
> > Thanks,
> > Xiaochuan Yu
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University