[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241301 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 13/May/19 20:46
Start Date: 13/May/19 20:46
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283529750

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/worker/package-info.java ##

@@ -0,0 +1,20 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** Utilities for configuring worker environment. */
+package org.apache.beam.sdk.worker;

Review comment:
@lukecwik do you have any objection to dropping this in the top-level of sdk core and calling it something like `BeamJvmInitializer` instead?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 241301)
Time Spent: 6h (was: 5h 50m)

> Add hook for user-defined JVM initialization in workers
> ---
>
> Key: BEAM-6872
> URL: https://issues.apache.org/jira/browse/BEAM-6872
> Project: Beam
> Issue Type: New Feature
> Components: runner-dataflow
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Minor
> Time Spent: 6h
> Remaining Estimate: 0h
>
> Expose an interface for users to run some one-time initialization code when a
> worker starts up.
> This can be useful for things like overriding the Default ZoneRulesProvider,
> or setting up custom SSL providers.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
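To illustrate the second use case named in the issue description above, here is a minimal sketch of one-time security-provider setup, the kind of work such a start-up hook might do. The class `SecurityProviderSketch`, the method `installOnce`, and the provider name `SketchProvider` are hypothetical and do not come from the pull request. The provider is an empty placeholder that offers no algorithms; only the registration through the standard `java.security.Security` API is being shown.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical sketch: one-time registration of a custom security provider,
// the kind of work a worker start-up hook might perform.
class SecurityProviderSketch {

  // Hypothetical provider name; a real setup would install an actual SSL provider.
  static final String PROVIDER_NAME = "SketchProvider";

  // Registers the placeholder provider at the highest priority if it is not
  // already installed, and returns true only when this call installed it.
  static synchronized boolean installOnce() {
    if (Security.getProvider(PROVIDER_NAME) != null) {
      return false; // already registered, nothing to do
    }
    // Empty provider: it offers no algorithms and exists only for illustration.
    Provider provider =
        new Provider(PROVIDER_NAME, "1.0", "placeholder provider for illustration") {};
    // insertProviderAt returns the position used, or -1 if it was not added.
    return Security.insertProviderAt(provider, 1) != -1;
  }

  public static void main(String[] args) {
    System.out.println("installed: " + installOnce());
    System.out.println("installed again: " + installOnce());
  }
}
```

The guard makes the call idempotent, which matters because the issue asks for code that runs once per worker JVM. The `Provider(String, String, String)` constructor used here assumes Java 9 or later.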
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241322 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 13/May/19 21:29
Start Date: 13/May/19 21:29
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283544945

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/worker/package-info.java ##

@@ -0,0 +1,20 @@
+/** Utilities for configuring worker environment. */
+package org.apache.beam.sdk.worker;

Review comment:
I wouldn't stick this in `org.apache.beam.sdk` because there is purposefully very little there (`Pipeline`, `PipelineResult`, ...)

If the term `worker` is the issue, then I would suggest using `harness` and put this in `org.apache.beam.sdk.harness` and name the class `BeamHarnessInitializer`.

Issue Time Tracking
---

Worklog Id: (was: 241322)
Time Spent: 6h 10m (was: 6h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241387 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 13/May/19 22:56
Start Date: 13/May/19 22:56
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283569050

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/worker/package-info.java ##

@@ -0,0 +1,20 @@
+/** Utilities for configuring worker environment. */
+package org.apache.beam.sdk.worker;

Review comment:
Ok I renamed the `worker` package to `harness` and `BeamWorkerInitializer` to `JvmInitializer` (I figured "beam" and "harness" were both redundant, while "JVM" communicates that this is meant to initialize any JVMs that are started).

So now the new interface and class are `org.apache.beam.sdk.harness.JvmInitializer` and `org.apache.beam.fn.harness.JvmInitializers`.

Issue Time Tracking
---

Worklog Id: (was: 241387)
Time Spent: 6h 20m (was: 6h 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241841&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241841 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283892194

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only
+ * supported in the portable worker and legacy Dataflow worker.
+ *
+ * {@link java.util.ServiceLoader} is used to discover implementations of {@link JvmInitializer},
+ * note that you will need to register your implementation with the appropriate resources to ensure
+ * your code is executed. You can use a tool like {@link com.google.auto.service.AutoService} to
+ * automate this.
+ */
+@Experimental
+public interface JvmInitializer {
+
+  /**
+   * Implement onStartup to run some custom initialization immediately after the worker begins
+   * running.

Review comment:
```suggestion
   * for pipeline execution.
```

Issue Time Tracking
---

Worklog Id: (was: 241841)
Time Spent: 7h (was: 6h 50m)
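The discovery mechanism described in the Javadoc above can be sketched with plain `java.util.ServiceLoader`. The interface below is a local stand-in for `org.apache.beam.sdk.harness.JvmInitializer`, declared only so that the example compiles without Beam on the classpath. `DiscoverySketch` and `discover` are hypothetical names, and this is not the code from the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Local stand-in for the interface under review (an assumption, so that the
// sketch compiles without Beam on the classpath).
interface JvmInitializer {
  default void onStartup() {}
}

// Hypothetical helper showing ServiceLoader-based discovery.
class DiscoverySketch {

  // Collects every implementation that ServiceLoader can find. An
  // implementation is found only if it is listed in a classpath resource
  // named META-INF/services/<fully qualified interface name>, which is the
  // file that AutoService generates.
  static List<JvmInitializer> discover() {
    List<JvmInitializer> found = new ArrayList<>();
    for (JvmInitializer initializer : ServiceLoader.load(JvmInitializer.class)) {
      found.add(initializer);
    }
    return found;
  }

  public static void main(String[] args) {
    // An unregistered implementation is never discovered, so its hooks never run.
    System.out.println("discovered initializers: " + discover().size());
  }
}
```

Run as-is, with no provider-configuration file present, the loader finds nothing. That is the failure mode the Javadoc warns about: an implementation that is compiled in but not registered is silently skipped.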
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241839 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283894172

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only

Review comment:
```suggestion
 * at the appropriate stage of execution after the JVM is launched. Currently this is only
```

Issue Time Tracking
---

Worklog Id: (was: 241839)
Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241844 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283892726

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only
+ * supported in the portable worker and legacy Dataflow worker.
+ *
+ * {@link java.util.ServiceLoader} is used to discover implementations of {@link JvmInitializer},
+ * note that you will need to register your implementation with the appropriate resources to ensure
+ * your code is executed. You can use a tool like {@link com.google.auto.service.AutoService} to
+ * automate this.
+ */
+@Experimental
+public interface JvmInitializer {
+
+  /**
+   * Implement onStartup to run some custom initialization immediately after the worker begins
+   * running.
+   *
+   * In general users should prefer to implement {@code beforeProcessing} to perform custom
+   * initialization so that basic services such as logging can be initialized first, but {@code
+   * onStartup} is also provided if initialization absolutely needs to be run immediately after
+   * starting.
+   */
+  default void onStartup() {}
+
+  /**
+   * Implement beforeProcessing to run some custom initialization after the worker initializes

Review comment:
```suggestion
   * Implement beforeProcessing to run some custom initialization after basic services such
```

Issue Time Tracking
---

Worklog Id: (was: 241844)
Time Spent: 7.5h (was: 7h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241843 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283891595

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only
+ * supported in the portable worker and legacy Dataflow worker.
+ *
+ * {@link java.util.ServiceLoader} is used to discover implementations of {@link JvmInitializer},
+ * note that you will need to register your implementation with the appropriate resources to ensure
+ * your code is executed. You can use a tool like {@link com.google.auto.service.AutoService} to
+ * automate this.
+ */
+@Experimental
+public interface JvmInitializer {
+
+  /**
+   * Implement onStartup to run some custom initialization immediately after the worker begins

Review comment:
```suggestion
   * Implement onStartup to run some custom initialization immediately after the JVM is launched
```

Issue Time Tracking
---

Worklog Id: (was: 241843)
Time Spent: 7h 20m (was: 7h 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241842 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283892935

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only
+ * supported in the portable worker and legacy Dataflow worker.
+ *
+ * {@link java.util.ServiceLoader} is used to discover implementations of {@link JvmInitializer},
+ * note that you will need to register your implementation with the appropriate resources to ensure
+ * your code is executed. You can use a tool like {@link com.google.auto.service.AutoService} to
+ * automate this.
+ */
+@Experimental
+public interface JvmInitializer {
+
+  /**
+   * Implement onStartup to run some custom initialization immediately after the worker begins
+   * running.
+   *
+   * In general users should prefer to implement {@code beforeProcessing} to perform custom
+   * initialization so that basic services such as logging can be initialized first, but {@code
+   * onStartup} is also provided if initialization absolutely needs to be run immediately after
+   * starting.
+   */
+  default void onStartup() {}
+
+  /**
+   * Implement beforeProcessing to run some custom initialization after the worker initializes
+   * itself, but before it begins process data.

Review comment:
```suggestion
   * as logging, but before data processing begins.
```

Issue Time Tracking
---

Worklog Id: (was: 241842)
Time Spent: 7h 10m (was: 7h)
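The ordering that the Javadoc and the suggestion above describe (start-up hooks first, then basic services such as logging, then `beforeProcessing`, then data processing) can be sketched as follows. `JvmInitializer` is again a local stand-in. The `Map<String, String>` parameter is an assumption standing in for Beam's `PipelineOptions`, which the file imports but whose use is not visible in the quoted diff. `HarnessSketch` and its event log are hypothetical and are not the actual harness code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Local stand-in for the interface under review. The Map parameter is an
// assumption standing in for Beam's PipelineOptions.
interface JvmInitializer {
  default void onStartup() {}

  default void beforeProcessing(Map<String, String> options) {}
}

// Hypothetical harness that records the order in which each stage runs.
class HarnessSketch {
  final List<String> events = new ArrayList<>();

  void run(List<JvmInitializer> initializers, Map<String, String> options) {
    // Stage 1: immediately after the JVM is launched, before any services exist.
    for (JvmInitializer initializer : initializers) {
      initializer.onStartup();
    }
    // Stage 2: basic services such as logging come up.
    events.add("logging initialized");
    // Stage 3: initializers that want logging available run here.
    for (JvmInitializer initializer : initializers) {
      initializer.beforeProcessing(options);
    }
    // Stage 4: data processing begins.
    events.add("processing started");
  }

  // Runs one example initializer through the harness and returns the event log.
  static List<String> demo() {
    HarnessSketch harness = new HarnessSketch();
    JvmInitializer initializer =
        new JvmInitializer() {
          @Override
          public void onStartup() {
            harness.events.add("onStartup");
          }

          @Override
          public void beforeProcessing(Map<String, String> options) {
            harness.events.add("beforeProcessing");
          }
        };
    harness.run(Collections.singletonList(initializer), Collections.emptyMap());
    return harness.events;
  }

  public static void main(String[] args) {
    // Prints the four events in the order the stages ran.
    demo().forEach(System.out::println);
  }
}
```

Both methods are `default` no-ops in the interface, so an implementation overrides only the stage it needs. That matches the Javadoc's advice to prefer `beforeProcessing` unless the work must happen before logging is available.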
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241840 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283890722

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.

Review comment:
```suggestion
 * A service interface for defining one-time initialization of the JVM during pipeline execution.
```

Issue Time Tracking
---

Worklog Id: (was: 241840)
Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241838 ]

ASF GitHub Bot logged work on BEAM-6872:

Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283894200

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##

@@ -0,0 +1,56 @@
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code
+ * beforeProcessing} functions at the appropriate stage of execution. Currently this is only
+ * supported in the portable worker and legacy Dataflow worker.

Review comment:
```suggestion
 * supported in portable pipelines or when using Google Cloud Dataflow.
```

Issue Time Tracking
---

Worklog Id: (was: 241838)
Time Spent: 6h 40m (was: 6.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=241837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-241837 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 14/May/19 16:38
Start Date: 14/May/19 16:38
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r283894078

## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java ##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.harness;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/**
+ * A service interface for defining one-time initialization for Beam workers.
+ *
+ * Beam workers will run every registered implementation's {@code onStartup} and {@code

Review comment:
```suggestion
 * During pipeline execution, {@code onStartup} and {@code beforeProcessing} will be invoked
```

Issue Time Tracking
---
Worklog Id: (was: 241837)
Time Spent: 6.5h (was: 6h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=245720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-245720 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 21/May/19 03:18
Start Date: 21/May/19 03:18
Worklog Time Spent: 10m
Work Description: aaltay commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#issuecomment-494224191

LGTM. Can we merge this?

Issue Time Tracking
---
Worklog Id: (was: 245720)
Time Spent: 7h 40m (was: 7.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=246262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-246262 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 21/May/19 18:02
Start Date: 21/May/19 18:02
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#issuecomment-494496295

I'm fine with merging it as is.

Issue Time Tracking
---
Worklog Id: (was: 246262)
Time Spent: 7h 50m (was: 7h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=246311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-246311 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 21/May/19 18:59
Start Date: 21/May/19 18:59
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104

Issue Time Tracking
---
Worklog Id: (was: 246311)
Time Spent: 8h (was: 7h 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216521 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 20/Mar/19 23:10
Start Date: 20/Mar/19 23:10
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104

Adds DataflowWorkerInitializer interface. Workers execute implementations of the interface in user code when they start up via ServiceLoader. Currently implementations are run immediately after logging is configured.

Issue Time Tracking
---
Worklog Id: (was: 216521)
Time Spent: 10m
Remaining Estimate: 0h
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216538 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 20/Mar/19 23:52
Start Date: 20/Mar/19 23:52
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#issuecomment-475071702

@lukecwik are you a good person to review this PR?

Issue Time Tracking
---
Worklog Id: (was: 216538)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=219665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219665 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 27/Mar/19 22:22
Start Date: 27/Mar/19 22:22
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r269794177

## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ##
@@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions
     DataflowWorkerLoggingInitializer.configure(pipelineOptions);
   }
+  public static void runUserDefinedInitialization() {
+    ServiceLoader loader =
+        ServiceLoader.load(DataflowWorkerInitializer.class);
+    for (DataflowWorkerInitializer initializer : loader) {

Review comment:
You should be able to write a test with a Test "initializer" that gets detected, loaded and performs some detectable side effect. I think there are several examples for serviceloader tests in the code base already.

Issue Time Tracking
---
Worklog Id: (was: 219665)
Time Spent: 50m (was: 40m)
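The review comment above asks for a test in which a test-only initializer is picked up and produces a detectable side effect. A minimal sketch of that idea follows, under stated assumptions: `InitializerUnderTest`, `CountingInitializer`, and `InitializerRunner` are hypothetical names, not classes from the PR, and the initializer is handed to the runner directly. In a real ServiceLoader test the implementation would instead be registered in a provider-configuration file under `META-INF/services/`, which a single snippet cannot show.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the initializer interface under review. */
interface InitializerUnderTest {
  void setup();
}

/** Test-only initializer whose side effect (a counter bump) is easy to observe. */
class CountingInitializer implements InitializerUnderTest {
  static final AtomicInteger SETUP_CALLS = new AtomicInteger();

  @Override
  public void setup() {
    SETUP_CALLS.incrementAndGet();
  }
}

/** Hypothetical runner; in the worker the Iterable would come from ServiceLoader. */
class InitializerRunner {
  static void runAll(Iterable<? extends InitializerUnderTest> initializers) {
    for (InitializerUnderTest initializer : initializers) {
      initializer.setup();
    }
  }
}
```

A test would reset the counter, call `InitializerRunner.runAll(...)`, and assert that the counter advanced by one per registered initializer.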
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=219662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219662 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 27/Mar/19 22:22
Start Date: 27/Mar/19 22:22
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r269790836

## File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowWorkerInitializer.java ##
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import java.util.ServiceLoader;
+import org.apache.beam.sdk.annotations.Experimental;
+
+/**
+ * A service interface for defining one-time initialization for Dataflow workers.
+ *
+ * Dataflow workers will use {@link ServiceLoader} to run every setup() implementation as soon as
+ * possible after start up. In general this should occur immediately after logging is setup and

Review comment:
Add this to the main SDK core, and also add it to the sdk/java/harness initialization in addition to where you have it inside the Dataflow worker. Call it something like "BeamWorkerInitializer". This will also allow all portable runners to get this feature.

Issue Time Tracking
---
Worklog Id: (was: 219662)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=219663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219663 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 27/Mar/19 22:22
Start Date: 27/Mar/19 22:22
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r269791386

## File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowWorkerInitializer.java ##
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import java.util.ServiceLoader;
+import org.apache.beam.sdk.annotations.Experimental;
+
+/**
+ * A service interface for defining one-time initialization for Dataflow workers.
+ *
+ * Dataflow workers will use {@link ServiceLoader} to run every setup() implementation as soon as
+ * possible after start up. In general this should occur immediately after logging is setup and
+ * configured, but before the worker begins processing elements.
+ */
+@Experimental
+public interface DataflowWorkerInitializer {
+  void setup();

Review comment:
Add two methods:
* onStartup(): This is called as soon as we enter main().
* beforeProcessing(PipelineOptions options): This is called before we start processing.

Give both methods default no-op implementations.

Issue Time Tracking
---
Worklog Id: (was: 219663)
Time Spent: 0.5h (was: 20m)
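The two-hook shape requested in the review comment above can be sketched as an interface with default no-op methods, so an implementation overrides only the hook it needs. This is an illustration under assumptions, not the code that was merged: `WorkerInitializerSketch`, `RecordingInitializer`, and `InitializerDemo` are hypothetical names, and `OptionsStandIn` replaces Beam's `PipelineOptions` so the snippet compiles without Beam on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Beam's PipelineOptions; used only to keep the sketch self-contained. */
interface OptionsStandIn {}

/** Hypothetical shape of the initializer with the two hooks named in the review. */
interface WorkerInitializerSketch {
  /** Intended to run as soon as the worker enters main(). No-op unless overridden. */
  default void onStartup() {}

  /** Intended to run before the worker starts processing. No-op unless overridden. */
  default void beforeProcessing(OptionsStandIn options) {}
}

/** Example implementation that overrides only onStartup(). */
class RecordingInitializer implements WorkerInitializerSketch {
  static final List<String> CALLS = new ArrayList<>();

  @Override
  public void onStartup() {
    CALLS.add("onStartup");
  }
}

class InitializerDemo {
  /** Invokes both hooks and returns the calls that the implementation recorded. */
  static List<String> run() {
    RecordingInitializer.CALLS.clear();
    WorkerInitializerSketch initializer = new RecordingInitializer();
    initializer.onStartup();
    initializer.beforeProcessing(new OptionsStandIn() {}); // inherited no-op
    return new ArrayList<>(RecordingInitializer.CALLS);
  }
}
```

Because both hooks have defaults, adding a further hook later would not break existing implementations, which is presumably part of the motivation for the request.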
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=219664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219664 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 27/Mar/19 22:22
Start Date: 27/Mar/19 22:22
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r269791744

## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ##
@@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions
     DataflowWorkerLoggingInitializer.configure(pipelineOptions);
   }
+  public static void runUserDefinedInitialization() {
+    ServiceLoader loader =

Review comment:
Sort these by fully qualified class name so that the execution order is stable across runs, even if the user rebuilds their application. Also use the ReflectHelper to find the class loader. You could copy most of the code from here: https://github.com/apache/beam/blob/1d9daf1aca101fa5a194cbbba969886734e08902/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java#L1793

Issue Time Tracking
---
Worklog Id: (was: 219664)
Time Spent: 40m (was: 0.5h)
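The stable-ordering request above can be sketched as follows: collect the providers that `ServiceLoader` finds, then sort them by fully qualified class name before running them. `OrderedServices`, `loadOrdered`, and `sortByClassName` are hypothetical names chosen for illustration. The class loader is taken as a parameter because the reviewer's suggested class-loader helper is not reproduced here, and this is not the code from `PipelineOptionsFactory`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical helper: loads service providers in a deterministic order. */
class OrderedServices {
  /** Loads every provider of {@code iface} visible to {@code classLoader}, sorted by class name. */
  static <T> List<T> loadOrdered(Class<T> iface, ClassLoader classLoader) {
    List<T> services = new ArrayList<>();
    for (T service : ServiceLoader.load(iface, classLoader)) {
      services.add(service);
    }
    sortByClassName(services);
    return services;
  }

  /** Sorts in place by fully qualified class name, so the order does not depend on classpath layout. */
  static <T> void sortByClassName(List<T> services) {
    services.sort(Comparator.comparing((T service) -> service.getClass().getName()));
  }
}
```

A caller would iterate over the returned list and invoke each initializer in turn. Sorting by name makes the order reproducible, although it says nothing about which order is semantically right for initializers that depend on each other.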
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=220247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220247 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 28/Mar/19 21:05
Start Date: 28/Mar/19 21:05
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r270198944

## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ##
@@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions
     DataflowWorkerLoggingInitializer.configure(pipelineOptions);
   }
+  public static void runUserDefinedInitialization() {
+    ServiceLoader loader =

Review comment:
Do you think it would make sense to add a helper to ReflectHelpers like `loadServicesOrdered(clazz)` for this pattern?

Issue Time Tracking
---
Worklog Id: (was: 220247)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=221363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221363 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 01/Apr/19 17:14
Start Date: 01/Apr/19 17:14
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r270967393

## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ##
@@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions
     DataflowWorkerLoggingInitializer.configure(pipelineOptions);
   }
+  public static void runUserDefinedInitialization() {
+    ServiceLoader loader =

Review comment:
Sounds worthwhile, since it has been copied a few times now.

Issue Time Tracking
---
Worklog Id: (was: 221363)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224503 ]

ASF GitHub Bot logged work on BEAM-6872:
Author: ASF GitHub Bot
Created on: 08/Apr/19 18:00
Start Date: 08/Apr/19 18:00
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers
URL: https://github.com/apache/beam/pull/8104#discussion_r273170518

## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ##
@@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions
     DataflowWorkerLoggingInitializer.configure(pipelineOptions);
   }
+  public static void runUserDefinedInitialization() {
+    ServiceLoader loader =
+        ServiceLoader.load(DataflowWorkerInitializer.class);
+    for (DataflowWorkerInitializer initializer : loader) {

Review comment:
I moved all the relevant logic into BeamWorkerInitializer and tested there. It's kind of tricky to test the worker harnesses directly.

Issue Time Tracking
---
Worklog Id: (was: 224503)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224504 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:06 Start Date: 08/Apr/19 18:06 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273172885 ## File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowWorkerInitializer.java ## @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.dataflow; + +import java.util.ServiceLoader; +import org.apache.beam.sdk.annotations.Experimental; + +/** + * A service interface for defining one-time initialization for Dataflow workers. + * + * Dataflow workers will use {@link ServiceLoader} to run every setup() implementation as soon as + * possible after start up. In general this should occur immediately after logging is setup and + * configured, but before the worker begins processing elements. 
+ */ +@Experimental +public interface DataflowWorkerInitializer { + void setup(); Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 224504) Time Spent: 1.5h (was: 1h 20m)
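The Javadoc quoted above says workers use ServiceLoader to run every setup() implementation. A minimal sketch of that loop follows, using only the JDK. The names WorkerInitSketch, WorkerInitializer and runAll are hypothetical stand-ins for illustration, not the code in the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class WorkerInitSketch {
  /** Hypothetical stand-in for the service interface under review. */
  public interface WorkerInitializer {
    void setup();
  }

  /** Calls setup() on every initializer and returns how many were run. */
  public static int runAll(Iterable<? extends WorkerInitializer> initializers) {
    int count = 0;
    for (WorkerInitializer initializer : initializers) {
      initializer.setup();
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    // ServiceLoader only finds implementations that are registered on the
    // classpath; with nothing registered, this loop simply runs zero times.
    int discovered = runAll(ServiceLoader.load(WorkerInitializer.class));
    System.out.println("initializers discovered via ServiceLoader: " + discovered);

    // The same loop driven by hand-built initializers, to show the call order.
    List<String> log = new ArrayList<>();
    List<WorkerInitializer> manual = List.of(() -> log.add("first"), () -> log.add("second"));
    runAll(manual);
    System.out.println("manual run order: " + log);
  }
}
```

ServiceLoader iterates lazily, so each implementation is instantiated only when the loop reaches it.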
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224505 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:07 Start Date: 08/Apr/19 18:07 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273173017 ## File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowWorkerInitializer.java ## @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.dataflow; + +import java.util.ServiceLoader; +import org.apache.beam.sdk.annotations.Experimental; + +/** + * A service interface for defining one-time initialization for Dataflow workers. + * + * Dataflow workers will use {@link ServiceLoader} to run every setup() implementation as soon as + * possible after start up. In general this should occur immediately after logging is setup and Review comment: Done. This is an automated message from the Apache Git Service. 
Issue Time Tracking --- Worklog Id: (was: 224505) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224506 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:07 Start Date: 08/Apr/19 18:07 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273173083 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowWorkerHarnessHelper.java ## @@ -95,6 +97,17 @@ public static void configureLogging(DataflowWorkerHarnessOptions pipelineOptions DataflowWorkerLoggingInitializer.configure(pipelineOptions); } + public static void runUserDefinedInitialization() { +ServiceLoader loader = Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 224506) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224540 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:43 Start Date: 08/Apr/19 18:43 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273185912 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/BeamWorkerInitializer.java ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.util.common.ReflectHelpers; + +/** + * A service interface for defining one-time initialization for Beam workers. + * + * Beam workers will use {@code runOnStartup} and {@code runBeforeProcessing} to run every + * registered implementation's {@code onStartup} and {@code beforeProcessing} functions at the + * appropriate stage of execution. 
+ * + * {@link java.util.ServiceLoader} is used to discover implementations of {@link + * BeamWorkerInitializer}, note that you will need to register your implementation with the + * appropriate resources to ensure your code is executed. You can use a tool like {@link + * com.google.auto.service.AutoService} to automate this. + */ +@Experimental +public abstract class BeamWorkerInitializer { + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * onStartup} methods. Called in worker harness implementations at the very beginning of their + * main method. + */ + public static void runOnStartup() { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.onStartup(); +} + } + + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * beforeProcessing} methods. Called in worker harness implementations after initialization but + * before beginning to process any data. + * + * @param options The pipeline options passed to the worker. + */ + public static void runBeforeProcessing(PipelineOptions options) { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.beforeProcessing(options); +} + } + + /** + * Implement onStartup to run some custom initialization immediately after the worker begins Review comment: I would move the methods you want people to implement to be at the top and hide the static methods in a different class inside the beam-fn-execution package. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 224540) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224539 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:43 Start Date: 08/Apr/19 18:43 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273186639 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/BeamWorkerInitializer.java ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; Review comment: We have been telling users to not use code in 'util' as it isn't backwards compatible. The other packages don't make sense so maybe we should start a 'org.apache.beam.sdk.worker' package. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 224539) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=224538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224538 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 08/Apr/19 18:43 Start Date: 08/Apr/19 18:43 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273185848 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/BeamWorkerInitializer.java ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.util.common.ReflectHelpers; + +/** + * A service interface for defining one-time initialization for Beam workers. + * + * Beam workers will use {@code runOnStartup} and {@code runBeforeProcessing} to run every + * registered implementation's {@code onStartup} and {@code beforeProcessing} functions at the + * appropriate stage of execution. 
+ * + * {@link java.util.ServiceLoader} is used to discover implementations of {@link + * BeamWorkerInitializer}, note that you will need to register your implementation with the + * appropriate resources to ensure your code is executed. You can use a tool like {@link + * com.google.auto.service.AutoService} to automate this. + */ Review comment: I would add a comment that this is only supported on Dataflow and portable runners. Issue Time Tracking --- Worklog Id: (was: 224538) Time Spent: 2h (was: 1h 50m)
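The Javadoc under review notes that an implementation has to be registered "with the appropriate resources" before ServiceLoader will find it. As a rough illustration of what that registration looks like with the plain JDK mechanism, here is a sketch. RegistrationSketch, Initializer, MarkerInitializer and the sketch.initialized property are all hypothetical names, and the base class is a simplified stand-in for the one in the pull request.

```java
import java.util.ServiceLoader;

public class RegistrationSketch {
  /** Hypothetical stand-in for the initializer base class under review. */
  public abstract static class Initializer {
    public void onStartup() {}
  }

  /**
   * A user implementation. For classpath discovery, ServiceLoader needs a
   * public class with a public zero-argument constructor.
   */
  public static class MarkerInitializer extends Initializer {
    @Override
    public void onStartup() {
      System.setProperty("sketch.initialized", "true");
    }
  }

  // Registration is a provider-configuration file on the classpath, named
  // after the binary name of the service type:
  //   META-INF/services/RegistrationSketch$Initializer
  // with one provider class name per line:
  //   RegistrationSketch$MarkerInitializer
  // An annotation processor such as AutoService generates this file.

  /** Runs onStartup() on every registered implementation; returns the count. */
  public static int runDiscovered() {
    int count = 0;
    for (Initializer initializer : ServiceLoader.load(Initializer.class)) {
      initializer.onStartup();
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    // Without the provider-configuration file nothing is discovered.
    System.out.println("ran " + runDiscovered() + " registered initializer(s)");
    System.out.println("sketch.initialized=" + System.getProperty("sketch.initialized"));
  }
}
```

If the provider-configuration file is missing, the loop runs zero times and no error is raised, which is why the Javadoc warns about registration explicitly.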
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225196 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 09/Apr/19 18:26 Start Date: 09/Apr/19 18:26 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273640379 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/BeamWorkerInitializer.java ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.util.common.ReflectHelpers; + +/** + * A service interface for defining one-time initialization for Beam workers. + * + * Beam workers will use {@code runOnStartup} and {@code runBeforeProcessing} to run every + * registered implementation's {@code onStartup} and {@code beforeProcessing} functions at the + * appropriate stage of execution. 
+ * + * {@link java.util.ServiceLoader} is used to discover implementations of {@link + * BeamWorkerInitializer}, note that you will need to register your implementation with the + * appropriate resources to ensure your code is executed. You can use a tool like {@link + * com.google.auto.service.AutoService} to automate this. + */ +@Experimental +public abstract class BeamWorkerInitializer { + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * onStartup} methods. Called in worker harness implementations at the very beginning of their + * main method. + */ + public static void runOnStartup() { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.onStartup(); +} + } + + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * beforeProcessing} methods. Called in worker harness implementations after initialization but + * before beginning to process any data. + * + * @param options The pipeline options passed to the worker. + */ + public static void runBeforeProcessing(PipelineOptions options) { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.beforeProcessing(options); +} + } + + /** + * Implement onStartup to run some custom initialization immediately after the worker begins Review comment: I moved the static methods to `org.apache.beam.sdk.fn.BeamWorkerInitializerHelpers` - is that what you had in mind? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 225196) Time Spent: 2h 20m (was: 2h 10m)
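The review thread above settles on a split: the class users extend keeps only the overridable hooks, and the static run methods move to a separate helper class used by worker harnesses. A minimal sketch of that shape follows. All names are hypothetical, a plain Map stands in for PipelineOptions, the no-op default hook bodies are an assumption, and ServiceLoader.load is used in place of the ReflectHelpers.loadServicesOrdered call shown in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

public class InitializerSplitSketch {
  /** User-facing type: only the hooks a user overrides (no-op defaults assumed). */
  public abstract static class Initializer {
    public void onStartup() {}

    public void beforeProcessing(Map<String, String> options) {}
  }

  /** Harness-facing static helpers, kept out of the user-facing type. */
  public static final class Initializers {
    private Initializers() {}

    /** Discovers registered implementations (plain ServiceLoader, unordered). */
    public static Iterable<Initializer> discover() {
      return ServiceLoader.load(Initializer.class);
    }

    /** Intended to be called at the very beginning of a harness main method. */
    public static void runOnStartup(Iterable<? extends Initializer> initializers) {
      for (Initializer initializer : initializers) {
        initializer.onStartup();
      }
    }

    /** Intended to be called after setup but before any data is processed. */
    public static void runBeforeProcessing(
        Iterable<? extends Initializer> initializers, Map<String, String> options) {
      for (Initializer initializer : initializers) {
        initializer.beforeProcessing(options);
      }
    }
  }

  public static void main(String[] args) {
    List<String> calls = new ArrayList<>();
    Initializer recording =
        new Initializer() {
          @Override
          public void onStartup() {
            calls.add("onStartup");
          }

          @Override
          public void beforeProcessing(Map<String, String> options) {
            calls.add("beforeProcessing " + options);
          }
        };

    // A harness would pass Initializers.discover(); a fixed list keeps the demo visible.
    Initializers.runOnStartup(List.of(recording));
    Initializers.runBeforeProcessing(List.of(recording), Map.of("jobName", "demo"));
    System.out.println(calls);
  }
}
```

Passing the initializers in as an argument, rather than discovering them inside each run method, is a choice made here so the helpers can be exercised without any registration. The diff under review performs discovery inside the static methods instead.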
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225308&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225308 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 09/Apr/19 22:23 Start Date: 09/Apr/19 22:23 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273728245 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/BeamWorkerInitializer.java ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.util.common.ReflectHelpers; + +/** + * A service interface for defining one-time initialization for Beam workers. + * + * Beam workers will use {@code runOnStartup} and {@code runBeforeProcessing} to run every + * registered implementation's {@code onStartup} and {@code beforeProcessing} functions at the + * appropriate stage of execution. 
+ * + * {@link java.util.ServiceLoader} is used to discover implementations of {@link + * BeamWorkerInitializer}, note that you will need to register your implementation with the + * appropriate resources to ensure your code is executed. You can use a tool like {@link + * com.google.auto.service.AutoService} to automate this. + */ +@Experimental +public abstract class BeamWorkerInitializer { + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * onStartup} methods. Called in worker harness implementations at the very beginning of their + * main method. + */ + public static void runOnStartup() { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.onStartup(); +} + } + + /** + * Finds all registered implementations of BeamWorkerInitializer and executes their {@code + * beforeProcessing} methods. Called in worker harness implementations after initialization but + * before beginning to process any data. + * + * @param options The pipeline options passed to the worker. + */ + public static void runBeforeProcessing(PipelineOptions options) { +for (BeamWorkerInitializer initializer : +ReflectHelpers.loadServicesOrdered(BeamWorkerInitializer.class)) { + initializer.beforeProcessing(options); +} + } + + /** + * Implement onStartup to run some custom initialization immediately after the worker begins Review comment: Yes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 225308) Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225310 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 09/Apr/19 22:31 Start Date: 09/Apr/19 22:31 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r273729514 ## File path: sdks/java/fn-execution/src/main/java/org/apache/beam/sdk/fn/BeamWorkerInitializerHelpers.java ## @@ -0,0 +1,51 @@ +/* Review comment: Could we rename this to BeamWorkerInitializers? That would follow the pattern of most of our other static-helper-like classes. (Yes, there are a few which are still named "Helpers".) Issue Time Tracking --- Worklog Id: (was: 225310) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225666 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 16:17 Start Date: 10/Apr/19 16:17 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481759150 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 225666) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225667 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 16:17 Start Date: 10/Apr/19 16:17 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481759150 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 225667) Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225669&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225669 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 16:17 Start Date: 10/Apr/19 16:17 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481759366 I'll squash and merge when tests are green. Issue Time Tracking --- Worklog Id: (was: 225669) Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225717 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 17:10 Start Date: 10/Apr/19 17:10 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481779041 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 225717) Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225756&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225756 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 18:06 Start Date: 10/Apr/19 18:06 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481800250 Hopefully tests pass now that I've added docstrings for the FnHarness main methods. I'd like to take a crack at adding tests to the portable worker using your continuation method idea before merging though. Issue Time Tracking --- Worklog Id: (was: 225756) Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225809 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 19:41 Start Date: 10/Apr/19 19:41 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481834078 I think I almost have a good test for this behavior in the portable worker, I just had to create a version of main that lets me inject a mock of `System.getenv`. I'll push that up shortly. I also discussed this offline a bit with @kennknowles, and he suggested we should probably discuss this on the mailing list before merging since it's a pretty major change. Issue Time Tracking --- Worklog Id: (was: 225809) Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225883 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 10/Apr/19 23:30 Start Date: 10/Apr/19 23:30 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481909886 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 225883) Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225926 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274199585 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/worker/BeamWorkerInitializer.java ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.worker; + +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.sdk.options.PipelineOptions; + +/** + * A service interface for defining one-time initialization for Beam workers. + * + * Beam workers will run every registered implementation's {@code onStartup} and {@code + * beforeProcessing} functions at the appropriate stage of execution. Currently this is only + * supported in the portable worker and legacy Dataflow worker. 
+ * + * {@link java.util.ServiceLoader} is used to discover implementations of {@link + * BeamWorkerInitializer}, note that you will need to register your implementation with the + * appropriate resources to ensure your code is executed. You can use a tool like {@link + * com.google.auto.service.AutoService} to automate this. + */ +@Experimental +public abstract class BeamWorkerInitializer { Review comment: In the days of yore an abstract class was preferred because you could not have default method implementation on an interface. Nowadays you can, so there's no real value to using the class hierarchy at all (since you should use composition to reuse implementation). Issue Time Tracking --- Worklog Id: (was: 225926) Time Spent: 4h (was: 3h 50m)
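The review comment above prefers an interface with default methods over an abstract class. A minimal sketch of that shape follows. It is not the code from the pull request: `SketchInitializer`, `RecordingInitializer`, and `SketchInitializers` are hypothetical names, and a plain `Map<String, String>` stands in for `PipelineOptions` so the example stays self-contained. Only `java.util.ServiceLoader` is a real API here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical stand-in for the hook under review; both methods default to no-ops. */
interface SketchInitializer {
  /** Intended to run once, as early as possible after the JVM starts. */
  default void onStartup() {}

  /** Intended to run once options are available, before any data is processed. */
  default void beforeProcessing(Map<String, String> options) {}
}

/** Overrides only the hook it needs; beforeProcessing keeps its default no-op body. */
class RecordingInitializer implements SketchInitializer {
  static final List<String> CALLS = new ArrayList<>();

  @Override
  public void onStartup() {
    CALLS.add("onStartup");
  }
}

/** Hypothetical runner that applies every initializer it is given. */
final class SketchInitializers {
  private SketchInitializers() {}

  /** Finds implementations registered under META-INF/services on the classpath. */
  static List<SketchInitializer> discover() {
    List<SketchInitializer> found = new ArrayList<>();
    for (SketchInitializer initializer : ServiceLoader.load(SketchInitializer.class)) {
      found.add(initializer);
    }
    return found;
  }

  static void runOnStartup(Iterable<SketchInitializer> initializers) {
    for (SketchInitializer initializer : initializers) {
      initializer.onStartup();
    }
  }

  static void runBeforeProcessing(
      Iterable<SketchInitializer> initializers, Map<String, String> options) {
    for (SketchInitializer initializer : initializers) {
      initializer.beforeProcessing(options);
    }
  }
}
```

With an interface, an implementer that only cares about one hook writes one method, and no class hierarchy is needed to inherit the empty bodies.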
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225928 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274199693 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/worker/package-info.java ## @@ -0,0 +1,20 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** Utilities for configuring worker environment. */ +package org.apache.beam.sdk.worker; Review comment: If you must make a new package, annotate with nonnull by default - poke around other packages to find this. Naming bikeshed: this is a JVM initializer, not a worker initializer, right? If a "worker" (which could mean many things to different runners, and is not a Beam concept) runs a bunch of JVMs they all should execute this? So I would just drop it in the top level of sdk core. Issue Time Tracking --- Worklog Id: (was: 225928) Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225927 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274240721 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnHarnessTest.java ## @@ -56,13 +67,39 @@ .setRegister(BeamFnApi.RegisterResponse.getDefaultInstance()) .build(); + private static @Mock Runnable onStartupMock = mock(Runnable.class); + private static @Mock Consumer beforeProcessingMock = mock(Consumer.class); + + /** + * Fake BeamWorkerInitializer that simply forwards calls to mocked functions so that they can be + * observed in tests. + */ + @AutoService(BeamWorkerInitializer.class) + public static class FnHarnessTestInitializer extends BeamWorkerInitializer { +@Override +public void onStartup() { + onStartupMock.run(); +} + +@Override +public void beforeProcessing(PipelineOptions options) { + beforeProcessingMock.accept(options); +} + } + @Test(timeout = 10 * 1000) @SuppressWarnings("FutureReturnValueIgnored") // failure will cause test to timeout. public void testLaunchFnHarnessAndTeardownCleanly() throws Exception { +Function environmentVariableMock = mock(Function.class); + PipelineOptions options = PipelineOptionsFactory.create(); +when(environmentVariableMock.apply("HARNESS_ID")).thenReturn("id"); Review comment: Does mockito add anything here? A simple lambda is clear. Issue Time Tracking --- Worklog Id: (was: 225927) Time Spent: 4h 10m (was: 4h)
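The "simple lambda" suggested in the review comment could look roughly like the sketch below: the environment accessor is an ordinary `Function<String, String>` backed by a map, with no mocking library involved. `FakeEnvironment` and `HarnessSketch` are hypothetical names for illustration, and `describe` merely stands in for whatever entry point reads its configuration through the accessor. The `"HARNESS_ID"` key and `"id"` value are taken from the test under review.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper that builds a fake environment accessor for tests. */
final class FakeEnvironment {
  private FakeEnvironment() {}

  /** Unknown names yield null, mirroring System.getenv(String). */
  static Function<String, String> of(Map<String, String> values) {
    Map<String, String> copy = new HashMap<>(values);
    return copy::get;
  }
}

/** Hypothetical code under test that takes the accessor instead of calling System.getenv. */
final class HarnessSketch {
  private HarnessSketch() {}

  static String describe(Function<String, String> env) {
    return "harness=" + env.apply("HARNESS_ID");
  }
}
```

Production code would presumably pass `System::getenv`, which also fits `Function<String, String>`, while a test passes the map-backed accessor.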
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225930 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274241047 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnHarnessTest.java ## @@ -118,13 +156,26 @@ public void testLaunchFnHarnessAndTeardownCleanly() throws Exception { .setUrl("localhost:" + controlServer.getPort()) .build(); -FnHarness.main("id", options, loggingDescriptor, controlDescriptor); -assertThat(instructionResponses, contains(INSTRUCTION_RESPONSE)); +when(environmentVariableMock.apply("LOGGING_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(loggingDescriptor)); +when(environmentVariableMock.apply("CONTROL_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(controlDescriptor)); + +FnHarness.main(environmentVariableMock); } finally { controlServer.shutdownNow(); } } finally { loggingServer.shutdownNow(); } + +// Verify that we first run onStartup functions before even reading the environment, and that +// we then call beforeProcessing functions before executing instructions. +InOrder inOrder = +inOrder(onStartupMock, beforeProcessingMock, environmentVariableMock, instructionResponses); +inOrder.verify(onStartupMock).run(); +inOrder.verify(environmentVariableMock, atLeastOnce()).apply(any()); +inOrder.verify(beforeProcessingMock).accept(any()); +inOrder.verify(instructionResponses).add(INSTRUCTION_RESPONSE); Review comment: I strongly dislike this test. (sorry!) What is the actual _behavior_ that should change based on the mock. I really don't think a mock is necessary or a good idea here. 
Issue Time Tracking --- Worklog Id: (was: 225930) Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225929 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274240813 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnHarnessTest.java ## @@ -118,13 +156,26 @@ public void testLaunchFnHarnessAndTeardownCleanly() throws Exception { .setUrl("localhost:" + controlServer.getPort()) .build(); -FnHarness.main("id", options, loggingDescriptor, controlDescriptor); -assertThat(instructionResponses, contains(INSTRUCTION_RESPONSE)); +when(environmentVariableMock.apply("LOGGING_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(loggingDescriptor)); +when(environmentVariableMock.apply("CONTROL_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(controlDescriptor)); + +FnHarness.main(environmentVariableMock); } finally { controlServer.shutdownNow(); } } finally { loggingServer.shutdownNow(); } + +// Verify that we first run onStartup functions before even reading the environment, and that +// we then call beforeProcessing functions before executing instructions. +InOrder inOrder = +inOrder(onStartupMock, beforeProcessingMock, environmentVariableMock, instructionResponses); +inOrder.verify(onStartupMock).run(); Review comment: Do you need to verify this if the result is correct? Issue Time Tracking --- Worklog Id: (was: 225929) Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225931 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:47 Start Date: 11/Apr/19 02:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274240770 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnHarnessTest.java ## @@ -118,13 +156,26 @@ public void testLaunchFnHarnessAndTeardownCleanly() throws Exception { .setUrl("localhost:" + controlServer.getPort()) .build(); -FnHarness.main("id", options, loggingDescriptor, controlDescriptor); -assertThat(instructionResponses, contains(INSTRUCTION_RESPONSE)); +when(environmentVariableMock.apply("LOGGING_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(loggingDescriptor)); +when(environmentVariableMock.apply("CONTROL_API_SERVICE_DESCRIPTOR")) +.thenReturn(TextFormat.printToString(controlDescriptor)); + +FnHarness.main(environmentVariableMock); } finally { controlServer.shutdownNow(); } } finally { loggingServer.shutdownNow(); } + +// Verify that we first run onStartup functions before even reading the environment, and that +// we then call beforeProcessing functions before executing instructions. +InOrder inOrder = +inOrder(onStartupMock, beforeProcessingMock, environmentVariableMock, instructionResponses); +inOrder.verify(onStartupMock).run(); +inOrder.verify(environmentVariableMock, atLeastOnce()).apply(any()); Review comment: Don't do `verify` when the return value of calling the function is adequate verification. Issue Time Tracking --- Worklog Id: (was: 225931) Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225937&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225937 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 02:55 Start Date: 11/Apr/19 02:55 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-481945621 Incidentally, the change to ReflectHelpers seems nice separately. Deterministic ordering is good for reproducing issues. Though we should consider the contents of the classpath to be nondeterministic because it is fairly unpredictable. So it does _not_ allow one to actually reason about what executes first. Only reproduce a run. Issue Time Tracking --- Worklog Id: (was: 225937) Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=225943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-225943 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 11/Apr/19 03:11 Start Date: 11/Apr/19 03:11 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r274243735 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowBatchWorkerHarness.java ## @@ -53,12 +54,14 @@ private DataflowBatchWorkerHarness(DataflowWorkerHarnessOptions pipelineOptions) /** Creates the worker harness and then runs it. */ public static void main(String[] args) throws Exception { +BeamWorkerInitializers.runOnStartup(); Review comment: There is an assumption that this will be the first line ever for processing, right? Is it possible to ensure this? If not, should we add a comment? Issue Time Tracking --- Worklog Id: (was: 225943) Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=228562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-228562 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 16/Apr/19 17:00 Start Date: 16/Apr/19 17:00 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r275899738 ## File path: sdks/java/harness/src/test/java/org/apache/beam/fn/harness/FnHarnessTest.java ## @@ -56,13 +67,39 @@ .setRegister(BeamFnApi.RegisterResponse.getDefaultInstance()) .build(); + private static @Mock Runnable onStartupMock = mock(Runnable.class); + private static @Mock Consumer beforeProcessingMock = mock(Consumer.class); + + /** + * Fake BeamWorkerInitializer that simply forwards calls to mocked functions so that they can be + * observed in tests. + */ + @AutoService(BeamWorkerInitializer.class) + public static class FnHarnessTestInitializer extends BeamWorkerInitializer { +@Override +public void onStartup() { + onStartupMock.run(); +} + +@Override +public void beforeProcessing(PipelineOptions options) { + beforeProcessingMock.accept(options); +} + } + @Test(timeout = 10 * 1000) @SuppressWarnings("FutureReturnValueIgnored") // failure will cause test to timeout. public void testLaunchFnHarnessAndTeardownCleanly() throws Exception { +Function environmentVariableMock = mock(Function.class); + PipelineOptions options = PipelineOptionsFactory.create(); +when(environmentVariableMock.apply("HARNESS_ID")).thenReturn("id"); Review comment: Mockito just gives me an easy way to verify the order in which things occurred. I decided to use a mock for the environment variable accessor so I could use reading it as a proxy for "starting to do anything other than running BeamWorkerInitializer.onStartup implementations." That way I can assert the proper ordering occurred: `onStartup` called, worker initialization (signaled by accessing environment), `beforeProcessing` called, start processing data. I'm definitely fooling myself a bit here; this doesn't really assert that `onStartup` is _the first_ thing that happened, just that it happened before any initialization based on environment variables. Do you think it would be better to just use a lambda for the environment variable accessor, and just assert the order `onStartup` -> `beforeProcessing` -> process data? Issue Time Tracking --- Worklog Id: (was: 228562) Time Spent: 5h 10m (was: 5h)
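The ordering described in the comment above (`onStartup`, then environment access, then `beforeProcessing`, then processing) can also be checked without a mocking library by having each fake append to a shared list. The following is a sketch under stated assumptions, not the test from the pull request: `OrderingSketch`, its `Hooks` interface, and `runHarness` are hypothetical stand-ins for the harness entry point and the initializer hooks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical miniature harness used only to illustrate order checking. */
final class OrderingSketch {
  private OrderingSketch() {}

  /** Stand-in for the two initializer hooks. */
  interface Hooks {
    void onStartup();

    void beforeProcessing();
  }

  /** Startup hooks, then configuration from the environment, then the second hook, then work. */
  static void runHarness(Hooks hooks, Function<String, String> env, Consumer<String> processor) {
    hooks.onStartup();
    String id = env.apply("HARNESS_ID");
    hooks.beforeProcessing();
    processor.accept(id);
  }

  /** Runs the harness with recording fakes and returns the events in the order they happened. */
  static List<String> observedOrder() {
    List<String> events = new ArrayList<>();
    Hooks hooks =
        new Hooks() {
          @Override
          public void onStartup() {
            events.add("onStartup");
          }

          @Override
          public void beforeProcessing() {
            events.add("beforeProcessing");
          }
        };
    // A plain lambda records each environment read and returns a fixed value.
    Function<String, String> env =
        name -> {
          events.add("env:" + name);
          return "id";
        };
    runHarness(hooks, env, id -> events.add("process:" + id));
    return events;
  }
}
```

A test then compares the returned list with the expected sequence in a single assertion. As the comment itself notes, this kind of check only shows that `onStartup` ran before the environment was read, not that it was the very first thing the JVM did.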
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=229439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229439 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 17/Apr/19 22:13 Start Date: 17/Apr/19 22:13 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-484281075 Kenn, the execution order will be deterministic as long as there is a unique class per service on the classpath. All bets are off once users start defining multiple classes with the same name across multiple jars. Issue Time Tracking --- Worklog Id: (was: 229439) Time Spent: 5h 20m (was: 5h 10m)
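One way to get a reproducible execution order for discovered services is to sort them by fully qualified class name before running them. The thread does not say how the ReflectHelpers change orders services, so treat the sort key below as an assumption; `OrderedServices` is a hypothetical helper, not Beam code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper illustrating a reproducible ordering of discovered services. */
final class OrderedServices {
  private OrderedServices() {}

  /** Copies the services and sorts the copy by fully qualified class name. */
  static <T> List<T> sortedByClassName(Iterable<T> services) {
    List<T> sorted = new ArrayList<>();
    for (T service : services) {
      sorted.add(service);
    }
    // List.sort is stable, so instances of the same class keep their discovery order.
    sorted.sort(Comparator.comparing((T service) -> service.getClass().getName()));
    return sorted;
  }
}
```

The input would typically be the iterable returned by `ServiceLoader.load(...)`. Sorting makes a given classpath reproducible, which matches the point made earlier in the thread: it helps rerun a failure, but it does not let anyone predict the order for a classpath they have not inspected.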
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=229902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229902 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 18/Apr/19 20:20 Start Date: 18/Apr/19 20:20 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r276818750 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowBatchWorkerHarness.java ## @@ -53,12 +54,14 @@ private DataflowBatchWorkerHarness(DataflowWorkerHarnessOptions pipelineOptions) /** Creates the worker harness and then runs it. */ public static void main(String[] args) throws Exception { +BeamWorkerInitializers.runOnStartup(); Review comment: Do you mean add a comment so that no one moves it in the future? Issue Time Tracking --- Worklog Id: (was: 229902) Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=229903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229903 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 18/Apr/19 20:25 Start Date: 18/Apr/19 20:25 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-484677248 Deterministic execution order is good for reproducing errors, but bad for a user to actually depend on. Issue Time Tracking --- Worklog Id: (was: 229903) Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=229956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229956 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 19/Apr/19 00:06 Start Date: 19/Apr/19 00:06 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#discussion_r276871571 ## File path: runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/DataflowBatchWorkerHarness.java ## @@ -53,12 +54,14 @@ private DataflowBatchWorkerHarness(DataflowWorkerHarnessOptions pipelineOptions) /** Creates the worker harness and then runs it. */ public static void main(String[] args) throws Exception { +BeamWorkerInitializers.runOnStartup(); Review comment: Yes, that works. Issue Time Tracking --- Worklog Id: (was: 229956) Time Spent: 5h 50m (was: 5h 40m)
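The outcome of this review exchange (keep the startup call as the first statement of `main`, with a comment warning against moving it) might look roughly like the sketch below. `StartupHooks` and `HarnessMainSketch` are hypothetical stand-ins rather than Beam classes; only the `runOnStartup()` method name is taken from the diff, and the event list exists solely to make the ordering observable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the initializer runner; not the class from the PR. */
class StartupHooks {
  static final List<String> EVENTS = new ArrayList<>();

  static void runOnStartup() {
    EVENTS.add("runOnStartup");
  }
}

class HarnessMainSketch {
  public static void main(String[] args) {
    // Keep this call as the first statement of main: user-defined JVM
    // initialization is meant to run before any other worker setup.
    // Do not move it below the harness construction.
    StartupHooks.runOnStartup();

    // Everything else follows (assumed here: creating and running the harness).
    StartupHooks.EVENTS.add("createAndRunHarness");
  }
}
```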