[jira] [Updated] (BEAM-14556) Honor custom formatters being installed on the root logging handler
[ https://issues.apache.org/jira/browse/BEAM-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-14556:
-----------------------------
    Fix Version/s: 2.40.0
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Honor custom formatters being installed on the root logging handler
> -------------------------------------------------------------------
>
> Key: BEAM-14556
> URL: https://issues.apache.org/jira/browse/BEAM-14556
> Project: Beam
> Issue Type: New Feature
> Components: runner-dataflow, sdk-java-harness
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Fix For: 2.40.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> This will allow users to create custom formatters integrating with an MDC if
> they so choose by using a JvmInitializer to install their custom formatter on
> the root logging handler.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
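For context, the change means a user-supplied java.util.logging Formatter installed on the root logger's handlers now takes effect. Below is a minimal sketch of what such a formatter might look like, using a hypothetical thread-local string as a stand-in for an MDC (JUL has no built-in MDC); in a real pipeline the install call would live in a JvmInitializer's onStartup rather than being called directly:

```java
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class MdcFormatterSketch {
  // Hypothetical per-thread context; stands in for a real MDC implementation.
  public static final ThreadLocal<String> MDC = ThreadLocal.withInitial(() -> "");

  public static class MdcFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
      // Prefix every log line with the current thread's MDC value.
      return "[" + MDC.get() + "] " + record.getLevel().getName() + ": "
          + formatMessage(record) + System.lineSeparator();
    }
  }

  // Install the custom formatter on every handler of the root logger.
  public static void installOnRootHandlers() {
    Logger root = Logger.getLogger("");
    for (Handler handler : root.getHandlers()) {
      handler.setFormatter(new MdcFormatter());
    }
  }
}
```

The formatter is ordinary JUL; the Beam-specific part is only *where* it gets installed (early enough, via the JvmInitializer hook, that the harness's root handlers pick it up).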
[jira] [Updated] (BEAM-6258) Data channel failing after some time for 1G data input
[ https://issues.apache.org/jira/browse/BEAM-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-6258:
----------------------------
    Fix Version/s: 2.40.0
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Data channel failing after some time for 1G data input
> ------------------------------------------------------
>
> Key: BEAM-6258
> URL: https://issues.apache.org/jira/browse/BEAM-6258
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Ankur Goenka
> Assignee: Luke Cwik
> Priority: P3
> Fix For: 2.40.0
>
> Attachments: d44b7eda9e4c_java_server_logs.logs.gz,
> d44b7eda9e4c_python_client_logs.log.bz2
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Data channel and logging channel are failing after some time with 1GB input
> data for chicago taxi.
>
> E1218 02:44:02.837680206 72 chttp2_transport.cc:1148] Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
> Exception in thread read_grpc_client_inputs:
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
>     self.run()
>   File "/usr/local/lib/python2.7/threading.py", line 754, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 273, in
>     target=lambda: self._read_inputs(elements_iterator),
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 260, in _read_inputs
>     for elements in elements_iterator:
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 347, in next
>     return self._next()
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 338, in _next
>     raise self
> _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.RESOURCE_EXHAUSTED, GOAWAY received)>
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 145, in _execute
>     response = task()
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 180, in
>     self._execute(lambda: worker.do_instruction(work), work)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 253, in do_instruction
>     request.instruction_id)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 269, in process_bundle
>     bundle_processor.process_bundle(instruction_id)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 481, in process_bundle
>     instruction_id, expected_targets):
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 209, in input_elements
>     raise_(t, v, tb)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 260, in _read_inputs
>     for elements in elements_iterator:
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 347, in next
>     return self._next()
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 338, in _next
>     raise self
> _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.RESOURCE_EXHAUSTED, GOAWAY received)>
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 145, in _execute
>     response = task()
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 180, in
>     self._execute(lambda: worker.do_instruction(work), work)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 253, in do_instruction
>     request.instruction_id)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 269, in process_bundle
>     bundle_processor.process_bundle(instruction_id)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 481, in process_bundle
>     instruction_id, expected_targets):
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 209, in input_elements
>     raise_(t, v, tb)
>   File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/data_plane.py", line 260, in _read_inputs
>     for elements in elements_iterator:
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 347, in next
>     return self._next()
>   File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 338, in _next
>     raise self
> _Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.RE
[jira] [Assigned] (BEAM-6258) Data channel failing after some time for 1G data input
[ https://issues.apache.org/jira/browse/BEAM-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik reassigned BEAM-6258:
-------------------------------
    Assignee: Luke Cwik

> Data channel failing after some time for 1G data input
> ------------------------------------------------------
>
> Key: BEAM-6258
> URL: https://issues.apache.org/jira/browse/BEAM-6258
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Ankur Goenka
> Assignee: Luke Cwik
> Priority: P3
>
> Attachments: d44b7eda9e4c_java_server_logs.logs.gz,
> d44b7eda9e4c_python_client_logs.log.bz2
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Data channel and logging channel are failing after some time with 1GB input
> data for chicago taxi.
[jira] [Created] (BEAM-14556) Honor custom formatters being installed on the root logging handler
Luke Cwik created BEAM-14556:
--------------------------------

             Summary: Honor custom formatters being installed on the root logging handler
                 Key: BEAM-14556
                 URL: https://issues.apache.org/jira/browse/BEAM-14556
             Project: Beam
          Issue Type: New Feature
          Components: runner-dataflow, sdk-java-harness
            Reporter: Luke Cwik
            Assignee: Luke Cwik

This will allow users to create custom formatters integrating with an MDC if they so choose by using a JvmInitializer to install their custom formatter on the root logging handler.
[jira] [Updated] (BEAM-14539) JulHandlerPrintStream failing to buffer carry over bytes
[ https://issues.apache.org/jira/browse/BEAM-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-14539:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Open)

> JulHandlerPrintStream failing to buffer carry over bytes
> --------------------------------------------------------
>
> Key: BEAM-14539
> URL: https://issues.apache.org/jira/browse/BEAM-14539
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.33.0, 2.34.0, 2.35.0, 2.36.0, 2.37.0, 2.38.0, 2.39.0
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Fix For: 2.40.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> JulHandlerPrintStream was not flushing in a loop for large byte arrays, which
> meant that the carry over could be much larger than the assumed 6 bytes.
> Saw these logs for a job:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: Index 9 out of bounds for length 6
> at org.apache.beam.runners.dataflow.worker.logging.JulHandlerPrintStreamAdapterFactory$JulHandlerPrintStream.write(JulHandlerPrintStreamAdapterFactory.java:142)
> at java.base/java.io.PrintStream.write(PrintStream.java:559)
> at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
> at java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
> at java.base/sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
> at java.base/java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:184)
> at java.base/java.io.PrintStream.write(PrintStream.java:606)
> at java.base/java.io.PrintStream.print(PrintStream.java:745)
> at java.base/java.io.PrintStream.append(PrintStream.java:1147)
> at java.base/java.io.PrintStream.append(PrintStream.java:1188)
> at java.base/java.io.PrintStream.append(PrintStream.java:63)
> at java.base/java.util.Formatter$FixedString.print(Formatter.java:2754)
> at java.base/java.util.Formatter.format(Formatter.java:2661)
> at java.base/java.io.PrintStream.format(PrintStream.java:1053)
> at java.base/java.io.PrintStream.printf(PrintStream.java:949)
> ...
> {noformat}
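The bug above is the classic chunking mistake: flushing at most once per write call lets the carry-over grow past the buffer's assumed capacity. Here is a self-contained sketch of the corrected pattern, flushing in a loop while copying, so the bytes carried over can never reach the buffer size (this illustrates the idea of the fix, not Beam's actual JulHandlerPrintStream code, whose buffer feeds a JUL Handler):

```java
public class LoopFlushBuffer {
  private final byte[] buf;
  private int count;
  private final StringBuilder flushed = new StringBuilder(); // stand-in for the downstream handler

  public LoopFlushBuffer(int capacity) {
    buf = new byte[capacity];
  }

  // Copy the input in chunks, flushing each time the buffer fills, so the
  // carry-over left in buf stays below capacity regardless of input size.
  public void write(byte[] src, int off, int len) {
    while (len > 0) {
      int n = Math.min(len, buf.length - count);
      System.arraycopy(src, off, buf, count, n);
      count += n;
      off += n;
      len -= n;
      if (count == buf.length) {
        flush();
      }
    }
  }

  public void flush() {
    flushed.append(new String(buf, 0, count));
    count = 0;
  }

  public int carryOver() {
    return count;
  }

  public String contents() {
    flush();
    return flushed.toString();
  }
}
```

A write of 10 bytes into a 6-byte buffer flushes once mid-copy and leaves only 4 bytes of carry-over; without the loop, the second chunk would have overrun the array exactly as in the ArrayIndexOutOfBoundsException above.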
[jira] [Updated] (BEAM-14539) JulHandlerPrintStream failing to buffer carry over bytes
[ https://issues.apache.org/jira/browse/BEAM-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-14539:
-----------------------------
    Description:
JulHandlerPrintStream was not flushing in a loop for large byte arrays, which meant that the carry over could be much larger than the assumed 6 bytes.

Saw these logs for a job:
{noformat}
java.lang.ArrayIndexOutOfBoundsException: Index 9 out of bounds for length 6
at org.apache.beam.runners.dataflow.worker.logging.JulHandlerPrintStreamAdapterFactory$JulHandlerPrintStream.write(JulHandlerPrintStreamAdapterFactory.java:142)
at java.base/java.io.PrintStream.write(PrintStream.java:559)
at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
at java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
at java.base/sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
at java.base/java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:184)
at java.base/java.io.PrintStream.write(PrintStream.java:606)
at java.base/java.io.PrintStream.print(PrintStream.java:745)
at java.base/java.io.PrintStream.append(PrintStream.java:1147)
at java.base/java.io.PrintStream.append(PrintStream.java:1188)
at java.base/java.io.PrintStream.append(PrintStream.java:63)
at java.base/java.util.Formatter$FixedString.print(Formatter.java:2754)
at java.base/java.util.Formatter.format(Formatter.java:2661)
at java.base/java.io.PrintStream.format(PrintStream.java:1053)
at java.base/java.io.PrintStream.printf(PrintStream.java:949)
...
{noformat}

    was:
JulHandlerPrintStream was not flushing in a loop for large byte arrays, which meant that the carry over could be much larger than the assumed 6 bytes.

Saw these logs for a job:
{noformat}
2022-05-26 07:40:30.284 PDT at org.apache.beam.runners.dataflow.worker.logging.JulHandlerPrintStreamAdapterFactory$JulHandlerPrintStream.write(JulHandlerPrintStreamAdapterFactory.java:142)
2022-05-26 07:40:30.284 PDT at java.base/java.io.PrintStream.write(PrintStream.java:559)
2022-05-26 07:40:30.284 PDT at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
2022-05-26 07:40:30.285 PDT at java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
2022-05-26 07:40:30.286 PDT at java.base/sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
2022-05-26 07:40:30.286 PDT at java.base/java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:184)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.write(PrintStream.java:606)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.print(PrintStream.java:745)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:1147)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:1188)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:63)
2022-05-26 07:40:30.286 PDT at java.base/java.util.Formatter$FixedString.print(Formatter.java:2754)
2022-05-26 07:40:30.286 PDT at java.base/java.util.Formatter.format(Formatter.java:2661)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.format(PrintStream.java:1053)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.printf(PrintStream.java:949)
{noformat}
[jira] [Created] (BEAM-14539) JulHandlerPrintStream failing to buffer carry over bytes
Luke Cwik created BEAM-14539:
--------------------------------

             Summary: JulHandlerPrintStream failing to buffer carry over bytes
                 Key: BEAM-14539
                 URL: https://issues.apache.org/jira/browse/BEAM-14539
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.39.0, 2.38.0, 2.37.0, 2.36.0, 2.35.0, 2.34.0, 2.33.0
            Reporter: Luke Cwik
            Assignee: Luke Cwik
             Fix For: 2.40.0

JulHandlerPrintStream was not flushing in a loop for large byte arrays, which meant that the carry over could be much larger than the assumed 6 bytes.

Saw these logs for a job:
{noformat}
2022-05-26 07:40:30.284 PDT at org.apache.beam.runners.dataflow.worker.logging.JulHandlerPrintStreamAdapterFactory$JulHandlerPrintStream.write(JulHandlerPrintStreamAdapterFactory.java:142)
2022-05-26 07:40:30.284 PDT at java.base/java.io.PrintStream.write(PrintStream.java:559)
2022-05-26 07:40:30.284 PDT at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
2022-05-26 07:40:30.285 PDT at java.base/sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:312)
2022-05-26 07:40:30.286 PDT at java.base/sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
2022-05-26 07:40:30.286 PDT at java.base/java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:184)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.write(PrintStream.java:606)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.print(PrintStream.java:745)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:1147)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:1188)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.append(PrintStream.java:63)
2022-05-26 07:40:30.286 PDT at java.base/java.util.Formatter$FixedString.print(Formatter.java:2754)
2022-05-26 07:40:30.286 PDT at java.base/java.util.Formatter.format(Formatter.java:2661)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.format(PrintStream.java:1053)
2022-05-26 07:40:30.286 PDT at java.base/java.io.PrintStream.printf(PrintStream.java:949)
{noformat}
[jira] [Commented] (BEAM-6258) Data channel failing after some time for 1G data input
[ https://issues.apache.org/jira/browse/BEAM-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543078#comment-17543078 ]

Luke Cwik commented on BEAM-6258:
---------------------------------

The underlying issue was fixed in gRPC c-core, and raising the minimum version to 1.33.1 ensures this no longer happens, since that release contains https://github.com/grpc/grpc/commit/6e1655447ab2146a643114687d7916249bfdf018, the fix for https://github.com/grpc/grpc-java/issues/5188.

> Data channel failing after some time for 1G data input
> ------------------------------------------------------
>
> Key: BEAM-6258
> URL: https://issues.apache.org/jira/browse/BEAM-6258
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness
> Reporter: Ankur Goenka
> Priority: P3
>
> Attachments: d44b7eda9e4c_java_server_logs.logs.gz,
> d44b7eda9e4c_python_client_logs.log.bz2
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Data channel and logging channel are failing after some time with 1GB input
> data for chicago taxi.
[jira] [Updated] (BEAM-14496) PrecombineGroupingTable not inheriting a values output
[ https://issues.apache.org/jira/browse/BEAM-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-14496:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Open)

> PrecombineGroupingTable not inheriting a values output
> ------------------------------------------------------
>
> Key: BEAM-14496
> URL: https://issues.apache.org/jira/browse/BEAM-14496
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-harness
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Fix For: 2.40.0
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> The precombine grouping table needs to produce an output with a timestamp
> from one of the input records.
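The requirement in the description is that a pre-combined output carry a timestamp taken from its inputs rather than an invented one. A small sketch of that contract, using the minimum input timestamp as the representative (an assumption for illustration; this is not Beam's PrecombineGroupingTable code, which may pick a different input's timestamp):

```java
import java.util.List;

public class PrecombineTimestampSketch {
  public static final class Timestamped {
    public final long value;
    public final long timestampMillis;

    public Timestamped(long value, long timestampMillis) {
      this.value = value;
      this.timestampMillis = timestampMillis;
    }
  }

  // Combine a buffered group into one output whose timestamp comes from the
  // inputs (here the minimum), instead of being synthesized by the combiner.
  public static Timestamped sum(List<Timestamped> inputs) {
    long sum = 0;
    long ts = Long.MAX_VALUE;
    for (Timestamped in : inputs) {
      sum += in.value;
      ts = Math.min(ts, in.timestampMillis);
    }
    return new Timestamped(sum, ts);
  }
}
```

Inheriting an input timestamp matters because downstream windowing and watermark logic treat the combined element as if it occurred at that time; a fabricated timestamp could place it in the wrong window.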
[jira] [Updated] (BEAM-14298) Can't resolve org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde
[ https://issues.apache.org/jira/browse/BEAM-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-14298:
-----------------------------
    Fix Version/s: Not applicable
         Assignee: Yi Hu  (was: Kenneth Knowles)
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Can't resolve org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde
> -------------------------------------------------------------------
>
> Key: BEAM-14298
> URL: https://issues.apache.org/jira/browse/BEAM-14298
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Liam Miller-Cushon
> Assignee: Yi Hu
> Priority: P0
> Labels: currently-failing
> Fix For: Not applicable
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> I am trying to run the 'Linkage checker' documented at
> [https://cwiki.apache.org/confluence/display/BEAM/Dependency+Upgrades#DependencyUpgrades-Linkagecheckeranalysis]
>
> I am seeing errors trying to resolve
> org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde. There was a previous
> report of the same dep not being available in standard maven repos
> (https://issues.apache.org/jira/browse/BEAM-11689), which was marked fixed,
> but I'm still seeing something similar.
>
> {{./gradlew -Ppublishing -PjavaLinkageArtifactIds=beam-sdks-java-core :checkJavaLinkage}}
> {{...}}
> {{* What went wrong:}}
> {{Execution failed for task ':sdks:java:io:hcatalog:compileJava'.}}
> {{> Could not resolve all files for configuration ':sdks:java:io:hcatalog:compileClasspath'.}}
> {{   > Could not find org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde.}}
> {{     Searched in the following locations:}}
> {{       - file:~/src/beam/sdks/java/io/hcatalog/offline-repository/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - https://repo.maven.apache.org/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - file:~/.m2/repository/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - https://public.nexus.pentaho.org/repository/proxy-public-3rd-party-release/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - https://oss.sonatype.org/content/repositories/staging/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - https://repository.apache.org/snapshots/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{       - https://repository.apache.org/content/repositories/releases/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom}}
> {{     Required by:}}
> {{       project :sdks:java:io:hcatalog > org.apache.hive:hive-exec:2.1.0 > org.apache.calcite:calcite-core:1.6.0}}
[jira] [Created] (BEAM-14496) PrecombineGroupingTable not inheriting a values output
Luke Cwik created BEAM-14496:
--------------------------------

             Summary: PrecombineGroupingTable not inheriting a values output
                 Key: BEAM-14496
                 URL: https://issues.apache.org/jira/browse/BEAM-14496
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-harness
            Reporter: Luke Cwik
            Assignee: Luke Cwik
             Fix For: 2.40.0

The precombine grouping table needs to produce an output with a timestamp from one of the input records.
[jira] [Updated] (BEAM-3576) protos should import one another via globally valid paths
[ https://issues.apache.org/jira/browse/BEAM-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Cwik updated BEAM-3576:
----------------------------
    Fix Version/s: 2.39.0
         Assignee: Milan Patel
       Resolution: Duplicate
           Status: Resolved  (was: Open)

> protos should import one another via globally valid paths
> ---------------------------------------------------------
>
> Key: BEAM-3576
> URL: https://issues.apache.org/jira/browse/BEAM-3576
> Project: Beam
> Issue Type: Sub-task
> Components: beam-model
> Reporter: Kenneth Knowles
> Assignee: Milan Patel
> Priority: P3
> Fix For: 2.39.0
>
> Right now the proto sub-bits depend on each other via jar since they build
> via maven. In those jars, the proto files themselves are dumped in the root.
> They should be placed in at least "beam/whatever.proto" or possibly
> "apache/beam/whatever.proto". This will make the proto files valid in a
> global namespace (like Java and Go imports and, I dare say, all well thought
> out import mechanisms).
[jira] [Commented] (BEAM-13667) Users cannot provide their own sharding function when using FileIO
[ https://issues.apache.org/jira/browse/BEAM-13667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539913#comment-17539913 ]

Luke Cwik commented on BEAM-13667:
----------------------------------

Really it would be best if support for dynamic sharding/GroupIntoBatches was added to the FlinkRunner. This would remove the need for specifying sharding functions.

> Users cannot provide their own sharding function when using FileIO
> ------------------------------------------------------------------
>
> Key: BEAM-13667
> URL: https://issues.apache.org/jira/browse/BEAM-13667
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Affects Versions: 2.35.0
> Reporter: Sandeep Kathula
> Priority: P3
>
> Beam uses RandomShardingFunction
> (https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L834)
> by default for sharding when using FileIO.write().
>
> RandomShardingFunction doesn't work well with the Flink runner. It assigns the
> same key (hashDestination(destination, destinationCoder)) along with the shard number
> (https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L863).
> This causes different ShardedKeys to go to the same task slot. As a result,
> files are not written evenly across task slots, e.g. 2 files written by task
> slot 1, 3 files by task slot 2, and 0 files by the other task slots.
>
> As a result, we are seeing Out of Memory from the pods writing more files.
>
> At
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L695-L698
> there is an option to give a different sharding function, but there is no way
> for the user to specify one when using the FileIO class.
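To make the complaint concrete: when every shard of a destination hashes to the same key, one task slot ends up writing most of the files. A hypothetical sketch of an even sharding scheme (the interface below is an illustrative stand-in, not Beam's actual sharding-function API) that assigns elements round-robin so writes spread across shards:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class EvenShardingSketch {
  // Illustrative stand-in for a pluggable sharding hook: maps an element
  // to a shard number in [0, numShards).
  public interface ShardFn<T> {
    int assignShard(T element, int numShards);
  }

  // Round-robin assignment: successive elements go to successive shards,
  // so no single shard (and hence no single task slot) accumulates all
  // the files the way a constant hashed key can.
  public static <T> ShardFn<T> roundRobin() {
    AtomicInteger next = new AtomicInteger();
    return (element, numShards) -> Math.floorMod(next.getAndIncrement(), numShards);
  }
}
```

Exposing a hook like this through FileIO would let users trade the default's randomness for a placement that their runner balances well.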
[jira] [Commented] (BEAM-14429) SyntheticUnboundedSource(with SDF) produce duplicate records when split with DEFAULT_DESIRED_NUM_SPLITS
[ https://issues.apache.org/jira/browse/BEAM-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534506#comment-17534506 ] Luke Cwik commented on BEAM-14429: -- It is used by portable runners (e.g. dataflow runner V2) for all unbounded sources (e.g. non SDF Kafka). > SyntheticUnboundedSource(with SDF) produce duplicate records when split with > DEFAULT_DESIRED_NUM_SPLITS > --- > > Key: BEAM-14429 > URL: https://issues.apache.org/jira/browse/BEAM-14429 > Project: Beam > Issue Type: Bug > Components: io-common >Reporter: Yichi Zhang >Assignee: John Casey >Priority: P1 > Fix For: 2.39.0 > > Time Spent: 2h > Remaining Estimate: 0h > > With the default 20 split, the num records produced by > Read.from(SyntheticUnboundedSource) is always larger than the numRecords > specified. the more splits the more actual number records produced is off. > And the Read step tends to take longer time with more splits. > [https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L512] > The issue is manifested with java LoadTests on dataflow runner v2. > Initial suspicion is that duplicate source readers for the same restriction > and checkpoint were created by multiple UnboundedSourceAsSDFWrapperFns. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (BEAM-14429) SyntheticUnboundedSource(with SDF) produce duplicate records when split with DEFAULT_DESIRED_NUM_SPLITS
[ https://issues.apache.org/jira/browse/BEAM-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534410#comment-17534410 ] Luke Cwik commented on BEAM-14429: -- Can you [fix this log statement|https://github.com/apache/beam/blob/5c21fbccec5e1e831dd0040bd7f631c050865430/sdks/java/io/synthetic/src/main/java/org/apache/beam/sdk/io/synthetic/SyntheticUnboundedSource.java#L112] so we can see the result of the initial split. All I see right now is: {noformat} Split into 20 bundles of sizes: [org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@1a0e6e3e, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@3d060183, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@413f3849, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@17e62ad8, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@7ef0d984, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@73f7d5c0, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@4b324687, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@6d2405d3, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@1560cd0a, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@587443b3, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@12b3044, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@4c6ca72f, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@5f773b35, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@25e482b3, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@7d152f5b, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@5469ba09, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@32219d22, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@335170cf, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@676afe09, org.apache.beam.sdk.io.synthetic.SyntheticUnboundedSource@7e4aa91f] {noformat} > SyntheticUnboundedSource(with SDF) produce duplicate records when split with > 
DEFAULT_DESIRED_NUM_SPLITS > --- > > Key: BEAM-14429 > URL: https://issues.apache.org/jira/browse/BEAM-14429 > Project: Beam > Issue Type: Bug > Components: io-common >Reporter: Yichi Zhang >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 2h > Remaining Estimate: 0h > > With the default of 20 splits, the number of records produced by > Read.from(SyntheticUnboundedSource) is always larger than the numRecords > specified; the more splits, the further off the actual record count is. > The Read step also tends to take longer with more splits. > [https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L512] > The issue is manifested with java LoadTests on dataflow runner v2. > Initial suspicion is that duplicate source readers for the same restriction > and checkpoint were created by multiple UnboundedSourceAsSDFWrapperFns. -- This message was sent by Atlassian Jira (v8.20.7#820007)
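The unreadable log quoted above comes from the sources relying on Object's default toString (class@hash). The requested log fix amounts to giving each split a toString that reports its range or size. The class below is a hypothetical stand-in, not Beam's SyntheticUnboundedSource.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch: overriding toString so "Split into N bundles of sizes" actually
// shows each bundle's range instead of an object identity hash.
class SyntheticSourceSketch {
    final long startOffset;
    final long endOffset;

    SyntheticSourceSketch(long startOffset, long endOffset) {
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }

    @Override
    public String toString() {
        // Half-open offset range covered by this split.
        return "[" + startOffset + ", " + endOffset + ")";
    }

    static String describeSplits(List<SyntheticSourceSketch> splits) {
        return "Split into " + splits.size() + " bundles of sizes: "
            + splits.stream()
                .map(SyntheticSourceSketch::toString)
                .collect(Collectors.joining(", "));
    }
}
```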
[jira] [Updated] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows
[ https://issues.apache.org/jira/browse/BEAM-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14064: - Resolution: Fixed Status: Resolved (was: In Progress) > ElasticSearchIO#Write buffering and outputting across windows > - > > Key: BEAM-14064 > URL: https://issues.apache.org/jira/browse/BEAM-14064 > Project: Beam > Issue Type: Bug > Components: io-java-elasticsearch >Affects Versions: 2.35.0, 2.36.0, 2.37.0 >Reporter: Luke Cwik >Assignee: Evan Galpin >Priority: P1 > Fix For: 2.39.0 > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db > Bug PR: https://github.com/apache/beam/pull/15381 > ElasticsearchIO is collecting results from elements in window X and then > trying to output them in window Y when flushing the batch. This exposed a bug > where elements that were being buffered were being output as part of a > different window than what the window that produced them was. > This became visible because validation was added recently to ensure that when > the pipeline is processing elements in window X that output with a timestamp > is valid for window X. Note that this validation only occurs in > *@ProcessElement* since output is associated with the current window with the > input element that is being processed. > It is ok to do this in *@FinishBundle* since there is no existing windowing > context and when you output that element is assigned to an appropriate window. 
> *Further Context* > We’ve bisected it to being introduced in 2.35.0, and I’m reasonably certain > it’s this PR https://github.com/apache/beam/pull/15381 > Our scenario is pretty trivial, we read off Pubsub and write to Elastic in a > streaming job, the config for the source and sink is respectively > {noformat} > pipeline.apply( > PubsubIO.readStrings().fromSubscription(subscription) > ).apply(ParseJsons.of(OurObject::class.java)) > .setCoder(KryoCoder.of()) > {noformat} > and > {noformat} > ElasticsearchIO.write() > .withUseStatefulBatches(true) > .withMaxParallelRequestsPerWindow(1) > .withMaxBufferingDuration(Duration.standardSeconds(30)) > // 5 bytes **> KiB **> MiB, so 5 MiB > .withMaxBatchSizeBytes(5L * 1024 * 1024) > // # of docs > .withMaxBatchSize(1000) > .withConnectionConfiguration( > ElasticsearchIO.ConnectionConfiguration.create( > arrayOf(host), > "fubar", > "_doc" > ).withConnectTimeout(5000) > .withSocketTimeout(3) > ) > .withRetryConfiguration( > ElasticsearchIO.RetryConfiguration.create( > 10, > // the duration is wall clock, against the connection and > socket timeouts specified > // above. I.e., 10 x 30s is gonna be more than 3 minutes, > so if we're getting > // 10 socket timeouts in a row, this would ignore the > "10" part and terminate > // after 6. The idea is that in a mixed failure mode, > you'd get different timeouts > // of different durations, and on average 10 x fails < 4m. > // That said, 4m is arbitrary, so adjust as and when > needed. 
> Duration.standardMinutes(4) > ) > ) > .withIdFn { f: JsonNode -> f["id"].asText() } > .withIndexFn { f: JsonNode -> f["schema_name"].asText() } > .withIsDeleteFn { f: JsonNode -> f["_action"].asText("noop") == > "delete" } > {noformat} > We recently tried upgrading 2.33 to 2.36 and immediately hit a bug in the > consumer, due to alleged time skew, specifically > {noformat} > 2022-03-07 10:48:37.886 GMTError message from worker: > java.lang.IllegalArgumentException: Cannot output with timestamp > 2022-03-07T10:43:38.640Z. Output timestamps must be no earlier than the > timestamp of the > current input (2022-03-07T10:43:43.562Z) minus the allowed skew (0 > milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the > DoFn#getAllowedTimestampSkew() Javadoc > for details on changing the allowed skew. > org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:446) > > org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:422) > > org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn
[jira] [Assigned] (BEAM-14387) DirectRunner does not update reference to currentRestriction when running in SDF
[ https://issues.apache.org/jira/browse/BEAM-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-14387: Assignee: Pablo Estrada (was: Luke Cwik) > DirectRunner does not update reference to currentRestriction when running in > SDF > > > Key: BEAM-14387 > URL: https://issues.apache.org/jira/browse/BEAM-14387 > Project: Beam > Issue Type: Bug > Components: runner-direct >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: P2 > > I have an SDF implementation that looks like so: > > {code:java} > class MyRestrictionTracker { > MyRestriction restriction; > currentRestriction() { return restriction; } > tryClaim(MyPosition position) { > this.restriction = new MyRestriction(position) > } > }{code} > I ran this on the DirectRunner, and the restriction would never advance: It > would get stuck on the very first value. > I also ran this on DataflowRunner, and the problem did not exist there: This > ran fine. > > I was able to fix this on the DirectRunner (it works well on Dataflow as > well) by changing the restriction to be mutable. Something like this: > > {code:java} > class MyRestrictionTracker { > MyRestriction restriction; > currentRestriction() { return restriction; } > tryClaim(MyPosition position) { > this.restriction.position = position; > } > }{code} > This looks like an execution issue with SDF on DirectRunner: The DirectRunner > is likely storing a reference to `currentRestriction()` and never updating it > as it runs. > > I'm happy to fix this on the DirectRunner - I would just like to find > pointers : ) -- This message was sent by Atlassian Jira (v8.20.7#820007)
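The report's two tracker variants can be reduced to a plain-Java sketch of the suspected mechanism: a runner that captures the object returned by currentRestriction() once sees later updates only if the tracker mutates that same object in place. The classes below are illustrative stand-ins, not Beam's RestrictionTracker API.

```java
// A restriction holding a claimed position.
class Restriction {
    long position;
    Restriction(long position) { this.position = position; }
}

// Tracker style from the original report: tryClaim replaces the restriction
// object, so a previously captured reference goes stale.
class ReplacingTracker {
    Restriction restriction = new Restriction(0);
    Restriction currentRestriction() { return restriction; }
    void tryClaim(long position) { restriction = new Restriction(position); }
}

// Tracker style from the workaround: tryClaim mutates the restriction in
// place, so any captured reference observes the new position.
class MutatingTracker {
    Restriction restriction = new Restriction(0);
    Restriction currentRestriction() { return restriction; }
    void tryClaim(long position) { restriction.position = position; }
}
```

This matches the observed symptom: with the replacing style, a runner holding one captured reference would appear stuck at the first value forever.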
[jira] [Commented] (BEAM-14387) DirectRunner does not update reference to currentRestriction when running in SDF
[ https://issues.apache.org/jira/browse/BEAM-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530321#comment-17530321 ] Luke Cwik commented on BEAM-14387: -- Run a debugger and start tracing from https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/SplittableProcessElementsEvaluatorFactory.java Note that this will enter the SplittableProcessElementsInvoker which is shared implementation across Java based runners (e.g. dataflow v1, flink). I would check that the current bundle is being checkpointed since that is the only time the restriction advances from the perspective of the runner. > DirectRunner does not update reference to currentRestriction when running in > SDF > > > Key: BEAM-14387 > URL: https://issues.apache.org/jira/browse/BEAM-14387 > Project: Beam > Issue Type: Bug > Components: runner-direct >Reporter: Pablo Estrada >Assignee: Luke Cwik >Priority: P2 > > I have an SDF implementation that looks like so: > > {code:java} > class MyRestrictionTracker { > MyRestriction restriction; > currentRestriction() { return restriction; } > tryClaim(MyPosition position) { > this.restriction = new MyRestriction(position) > } > }{code} > I ran this on the DirectRunner, and the restriction would never advance: It > would get stuck on the very first value. > I also ran this on DataflowRunner, and the problem did not exist there: This > ran fine. > > I was able to fix this on the DirectRunner (it works well on Dataflow as > well) by changing the restriction to be mutable. Something like this: > > {code:java} > class MyRestrictionTracker { > MyRestriction restriction; > currentRestriction() { return restriction; } > tryClaim(MyPosition position) { > this.restriction.position = position; > } > }{code} > This looks like an execution issue with SDF on DirectRunner: The DirectRunner > is likely storing a reference to `currentRestriction()` and never updating it > as it runs. 
> > I'm happy to fix this on the DirectRunner - I would just like to find > pointers : ) -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (BEAM-14184) DirectStreamObserver does not respect channel isReady
[ https://issues.apache.org/jira/browse/BEAM-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14184: - Fix Version/s: 2.39.0 Resolution: Fixed Status: Resolved (was: Open) > DirectStreamObserver does not respect channel isReady > - > > Key: BEAM-14184 > URL: https://issues.apache.org/jira/browse/BEAM-14184 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.39.0 > > > Leads to OOMs like: > {noformat} > Output channel stalled for 1023s, outbound thread CHAIN MapPartition > (MapPartition at [1]PerformInference) -> FlatMap (FlatMap at > ExtractOutput[0]) -> Map (Key Extractor) -> GroupCombine (GroupCombine at > GroupCombine: > PerformInferenceAndCombineResults_dep_049/GroupPredictionsByImage) -> Map > (Key Extractor) (1/1). See: https://issues.apache.org/jira/browse/BEAM-4280 > for the history for this issue. > Feb 18, 2022 11:51:05 AM > org.apache.beam.vendor.grpc.v1p36p0.io.grpc.netty.NettyServerTransport > notifyTerminated > INFO: Transport failed > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.OutOfDirectMemoryError: > failed to allocate 2097152 byte(s) of direct memory (used: 1205862679, max: > 1207959552) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:204) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:188) > {noformat} > See 
more context in > https://lists.apache.org/thread/llmxodbmczhn10c98prs8wmd5hy4nvff -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14187: - Resolution: Fixed Status: Resolved (was: Triage Needed) Follow-up NPE was fixed by https://github.com/apache/beam/pull/17454 > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Assignee: Minbo Bae >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side input intermittently throws > {{NullPointerException}} or {{{}IllegalStateException{}}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl. initializeForKeyedRead > |https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job is > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. Hypothesis > The {{initializeForKeyedRead}} is not synchronized. Multiple threads can > enter the method so that initialize the index for the same shard and update > {{indexPerShard}} without synchronization. 
And, the {{overKeyComponents}} > also accesses {{indexPerShard}} without synchronization. As {{indexPerShard}} > is just a {{HashMap}}, which is not thread-safe, this can cause the > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think this issue can be fixed by changing the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.7#820007)
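The suggested fix can be sketched with the standard library: ConcurrentHashMap.computeIfAbsent makes the check-then-initialize a single atomic step per key, which a plain HashMap cannot guarantee under concurrent readers and writers. The names below are illustrative, not IsmReaderImpl's actual fields.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the proposed change: back the per-shard index with a thread-safe
// map so concurrent initialization of the same shard's index is safe.
class ShardIndexCache {
    private final Map<Integer, String> indexPerShard = new ConcurrentHashMap<>();

    // Initialize the index for a shard at most once, even when many threads
    // request the same shard concurrently.
    String indexFor(int shardId) {
        return indexPerShard.computeIfAbsent(shardId, id -> "index-for-shard-" + id);
    }
}
```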
[jira] [Commented] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528414#comment-17528414 ] Luke Cwik commented on BEAM-14187: -- There is a consistent NullPointerException now occurring at https://github.com/apache/beam/blob/8ecd51397550bff26f1a245dc4afd4ae6ea4c046/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L554 Optional.of has to be changed to Optional.fromNullable > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Assignee: Minbo Bae >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side input intermittently throws > {{NullPointerException}} or {{{}IllegalStateException{}}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl. initializeForKeyedRead > |https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job is > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. 
Hypothesis > The {{initializeForKeyedRead}} is not synchronized. Multiple threads can > enter the method so that initialize the index for the same shard and update > {{indexPerShard}} without synchronization. And, the {{overKeyComponents}} > also accesses {{indexPerShard}} without synchronization. As {{indexPerShard}} > is just a {{HashMap}} which is not thread-safe, it can cause > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think it can fix this issue if we change the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.7#820007)
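The Optional.of vs Optional.fromNullable distinction behind the comment above can be shown with java.util.Optional (the Beam worker code uses a vendored Guava Optional, whose equivalent of ofNullable is fromNullable): of(null) throws a NullPointerException, while ofNullable(null) yields an empty Optional.

```java
import java.util.Optional;

// Demonstrates why wrapping a possibly-null value requires ofNullable:
// Optional.of rejects null outright.
class OptionalDemo {
    static boolean ofThrowsOnNull() {
        try {
            Optional.of(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    static boolean ofNullableIsEmptyOnNull() {
        return !Optional.ofNullable(null).isPresent();
    }
}
```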
[jira] [Commented] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528415#comment-17528415 ] Luke Cwik commented on BEAM-14187: -- This needs to be fixed for 2.39 release otherwise Dataflow bounded pipelines with side inputs will likely fail. > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Assignee: Minbo Bae >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side input intermittently throws > {{NullPointerException}} or {{{}IllegalStateException{}}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl. initializeForKeyedRead > |https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job is > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. Hypothesis > The {{initializeForKeyedRead}} is not synchronized. 
Multiple threads can > enter the method so that initialize the index for the same shard and update > {{indexPerShard}} without synchronization. And, the {{overKeyComponents}} > also accesses {{indexPerShard}} without synchronization. As {{indexPerShard}} > is just a {{HashMap}} which is not thread-safe, it can cause > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think it can fix this issue if we change the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Reopened] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-14187: -- > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Assignee: Minbo Bae >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side input intermittently throws > {{NullPointerException}} or {{{}IllegalStateException{}}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl. initializeForKeyedRead > |https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job is > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. Hypothesis > The {{initializeForKeyedRead}} is not synchronized. Multiple threads can > enter the method so that initialize the index for the same shard and update > {{indexPerShard}} without synchronization. And, the {{overKeyComponents}} > also accesses {{indexPerShard}} without synchronization. 
As {{indexPerShard}} > is just a {{HashMap}} which is not thread-safe, it can cause > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think it can fix this issue if we change the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (BEAM-13608) Dynamic Topics management
[ https://issues.apache.org/jira/browse/BEAM-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13608: - Fix Version/s: 2.39.0 Resolution: Fixed Status: Resolved (was: Open) > Dynamic Topics management > - > > Key: BEAM-13608 > URL: https://issues.apache.org/jira/browse/BEAM-13608 > Project: Beam > Issue Type: Improvement > Components: io-java-jms >Reporter: Vincent BALLADA >Assignee: Vincent BALLADA >Priority: P2 > Labels: assigned: > Fix For: 2.39.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > The JmsIO write function can publish messages to topics with static names: > company/employee/id/1234567. > Some AMQP/JMS brokers provide the ability to publish to dynamic topics like: > company/employee/id/\{employeeId} > If we want to handle that with Apache Beam JmsIO, we must create a branch per > employeeId, which is not suitable for a company with thousands of employees, or > other similar use cases. > The JmsIO write function should provide the ability to handle dynamic topics. -- This message was sent by Atlassian Jira (v8.20.7#820007)
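The dynamic-topic behavior the issue asks for amounts to deriving the destination from each message rather than fixing it at graph-construction time. A minimal sketch of that per-message topic function, with a hypothetical EmployeeEvent type that is not part of JmsIO's API:

```java
import java.util.function.Function;

// Sketch: one topic function covers every employee, instead of one pipeline
// branch per employeeId.
class DynamicTopicSketch {
    static class EmployeeEvent {
        final String employeeId;
        EmployeeEvent(String employeeId) { this.employeeId = employeeId; }
    }

    // Resolve the destination topic from the message being written.
    static final Function<EmployeeEvent, String> TOPIC_FN =
        event -> "company/employee/id/" + event.employeeId;
}
```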
[jira] [Commented] (BEAM-14184) DirectStreamObserver does not respect channel isReady
[ https://issues.apache.org/jira/browse/BEAM-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521697#comment-17521697 ] Luke Cwik commented on BEAM-14184: -- Also, if your willing to review the fix, that would help get it submitted sooner. > DirectStreamObserver does not respect channel isReady > - > > Key: BEAM-14184 > URL: https://issues.apache.org/jira/browse/BEAM-14184 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > > Leads to OOMs like: > {noformat} > Output channel stalled for 1023s, outbound thread CHAIN MapPartition > (MapPartition at [1]PerformInference) -> FlatMap (FlatMap at > ExtractOutput[0]) -> Map (Key Extractor) -> GroupCombine (GroupCombine at > GroupCombine: > PerformInferenceAndCombineResults_dep_049/GroupPredictionsByImage) -> Map > (Key Extractor) (1/1). See: https://issues.apache.org/jira/browse/BEAM-4280 > for the history for this issue. > Feb 18, 2022 11:51:05 AM > org.apache.beam.vendor.grpc.v1p36p0.io.grpc.netty.NettyServerTransport > notifyTerminated > INFO: Transport failed > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.OutOfDirectMemoryError: > failed to allocate 2097152 byte(s) of direct memory (used: 1205862679, max: > 1207959552) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:204) > at > 
org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:188) > {noformat} > See more context in > https://lists.apache.org/thread/llmxodbmczhn10c98prs8wmd5hy4nvff -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14184) DirectStreamObserver does not respect channel isReady
[ https://issues.apache.org/jira/browse/BEAM-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521695#comment-17521695 ] Luke Cwik commented on BEAM-14184: -- https://github.com/apache/beam/pull/17358 is out for review with a fix that should make it into the next release. You need to set the experiments pipeline option. You can do this during pipeline construction or on the command line when launching your pipeline via *--experiments=...*. See section 2.1 of the [programming guide|https://beam.apache.org/documentation/programming-guide/] for more details on setting pipeline options. > DirectStreamObserver does not respect channel isReady > - > > Key: BEAM-14184 > URL: https://issues.apache.org/jira/browse/BEAM-14184 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > > Leads to OOMs like: > {noformat} > Output channel stalled for 1023s, outbound thread CHAIN MapPartition > (MapPartition at [1]PerformInference) -> FlatMap (FlatMap at > ExtractOutput[0]) -> Map (Key Extractor) -> GroupCombine (GroupCombine at > GroupCombine: > PerformInferenceAndCombineResults_dep_049/GroupPredictionsByImage) -> Map > (Key Extractor) (1/1). See: https://issues.apache.org/jira/browse/BEAM-4280 > for the history for this issue. 
> Feb 18, 2022 11:51:05 AM > org.apache.beam.vendor.grpc.v1p36p0.io.grpc.netty.NettyServerTransport > notifyTerminated > INFO: Transport failed > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.OutOfDirectMemoryError: > failed to allocate 2097152 byte(s) of direct memory (used: 1205862679, max: > 1207959552) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:204) > at > org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:188) > {noformat} > See more context in > https://lists.apache.org/thread/llmxodbmczhn10c98prs8wmd5hy4nvff -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14187: - Fix Version/s: 2.39.0 Assignee: Minbo Bae Resolution: Fixed Status: Resolved (was: Triage Needed) > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Assignee: Minbo Bae >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side inputs intermittently throw > {{NullPointerException}} or {{IllegalStateException}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl.initializeForKeyedRead|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job are > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. Hypothesis > The {{initializeForKeyedRead}} method is not synchronized. Multiple threads can > enter the method, so they may initialize the index for the same shard and update > {{indexPerShard}} without synchronization.
And the {{overKeyComponents}} > method also accesses {{indexPerShard}} without synchronization. As {{indexPerShard}} > is just a {{HashMap}}, which is not thread-safe, this can cause the > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think we can fix this issue by changing the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.1#820001)
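A minimal sketch of the suggested fix: with a {{ConcurrentHashMap}}, {{computeIfAbsent}} makes the per-shard initialization atomic. The class and field names below are illustrative stand-ins, not the actual IsmReaderImpl code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for IsmReaderImpl's indexPerShard field.
public class IndexCacheSketch {
  private final Map<Integer, String> indexPerShard = new ConcurrentHashMap<>();

  // computeIfAbsent runs the initializer atomically per key, so two reader
  // threads racing on the same shard cannot both initialize it or observe a
  // partially built entry.
  public String indexFor(int shardId) {
    return indexPerShard.computeIfAbsent(shardId, id -> "index-for-shard-" + id);
  }
}
```

With a plain {{HashMap}}, the same racing calls could corrupt the table structure; {{ConcurrentHashMap}} guarantees the mapping function runs at most once per key.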
[jira] [Commented] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519138#comment-17519138 ] Luke Cwik commented on BEAM-14192: -- Did a dataflow service release pick up your updated containers? > ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > -- > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow, test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The Dataflow ITs fails with a class version mismatch. I believe the Dataflow > v1 container that is being tested was built with the wrong JDK version. > Jenkins: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-13939) Go SDK: Protobuf namespace conflict
[ https://issues.apache.org/jira/browse/BEAM-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13939: - Status: Resolved (was: Resolved) > Go SDK: Protobuf namespace conflict > --- > > Key: BEAM-13939 > URL: https://issues.apache.org/jira/browse/BEAM-13939 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go >Affects Versions: 2.36.0 >Reporter: Milan Patel >Assignee: Milan Patel >Priority: P2 > Fix For: 2.39.0 > > Attachments: demobug.zip > > Time Spent: 28h 40m > Remaining Estimate: 0h > > The Go SDK generated grpc protobufs are not namespaced with enough > granularity. If a user has another external dependency with the same protobuf > file registered with the proto runtime, their compiled binary will panic at > runtime pointing the user to this [doc > page|https://developers.google.com/protocol-buffers/docs/reference/go/faq#fix-namespace-conflict]. > > In the interim, following the instructions to add either ldflags to the > compiler or an environment var to the binary works, but this is an unideal > solution since only one of the duplicate proto specifications will be > accessible from a [global > registry|https://pkg.go.dev/google.golang.org/protobuf@v1.27.1/reflect/protoregistry]. > > Ask: Regenerate the go protos such that descriptors like > [these|https://github.com/apache/beam/blob/84353a7c973d3acaaa56d81c265dce7193a56be5/sdks/go/pkg/beam/model/pipeline_v1/metrics.pb.go#L797-L811] > are outputted with filenames that are more granular, such as a filename that > includes the directory structure of the repository. -- This message was sent by Atlassian Jira (v8.20.1#820001)
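For reference, the interim workarounds mentioned above come from the linked protobuf FAQ; assuming a compiled Go binary named {{./mybinary}} (a hypothetical name), they look like:

```shell
# 1) Downgrade the registration conflict to a warning at build time:
go build -ldflags "-X google.golang.org/protobuf/reflect/protoregistry.conflictPolicy=warn" ./...

# 2) Or downgrade it at run time via an environment variable:
GOLANG_PROTOBUF_REGISTRATION_CONFLICT=warn ./mybinary
```

Both only suppress the panic; as the issue notes, one of the duplicate registrations still shadows the other in the global registry.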
[jira] [Assigned] (BEAM-13939) Go SDK: Protobuf namespace conflict
[ https://issues.apache.org/jira/browse/BEAM-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-13939: Assignee: Milan Patel > Go SDK: Protobuf namespace conflict > --- > > Key: BEAM-13939 > URL: https://issues.apache.org/jira/browse/BEAM-13939 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go >Affects Versions: 2.36.0 >Reporter: Milan Patel >Assignee: Milan Patel >Priority: P2 > Attachments: demobug.zip > > Time Spent: 28h 40m > Remaining Estimate: 0h > > The Go SDK generated grpc protobufs are not namespaced with enough > granularity. If a user has another external dependency with the same protobuf > file registered with the proto runtime, their compiled binary will panic at > runtime pointing the user to this [doc > page|https://developers.google.com/protocol-buffers/docs/reference/go/faq#fix-namespace-conflict]. > > In the interim, following the instructions to add either ldflags to the > compiler or an environment var to the binary works, but this is an unideal > solution since only one of the duplicate proto specifications will be > accessible from a [global > registry|https://pkg.go.dev/google.golang.org/protobuf@v1.27.1/reflect/protoregistry]. > > Ask: Regenerate the go protos such that descriptors like > [these|https://github.com/apache/beam/blob/84353a7c973d3acaaa56d81c265dce7193a56be5/sdks/go/pkg/beam/model/pipeline_v1/metrics.pb.go#L797-L811] > are outputted with filenames that are more granular, such as a filename that > includes the directory structure of the repository. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14187) NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519078#comment-17519078 ] Luke Cwik commented on BEAM-14187: -- Note that initializeBloomFilterAndIndexPerShard is synchronized, and it is the only method that mutates the indexPerShard variable. Since initializeBloomFilterAndIndexPerShard is invoked within overKeyComponents, there is a memory barrier between the thread that created the indexPerShard and other threads, ensuring that the object is safely published across threads. Also, the [Javadoc for HashMap|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html] says that concurrent reads are OK; it's concurrent modifications that require synchronization: "If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally." > NullPointerException or IllegalStateException at IsmReaderImpl in Dataflow > -- > > Key: BEAM-14187 > URL: https://issues.apache.org/jira/browse/BEAM-14187 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Minbo Bae >Priority: P2 > Time Spent: 0.5h > Remaining Estimate: 0h > > h6. Problem > Dataflow Java batch jobs with large side inputs intermittently throw > {{NullPointerException}} or {{IllegalStateException}}. > * > [NullPointerException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/npe.png] > happens at > [IsmReaderImpl.overKeyComponents|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L217]: > * > [IllegalStateException|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/IllegalStateException.png] > happens at [IsmReaderImpl.
initializeForKeyedRead|https://github.com/apache/beam/blob/v2.37.0/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/IsmReaderImpl.java#L500]. > (all error logs in the Dataflow job are > [here|https://gist.githubusercontent.com/baeminbo/459e283eadbc7752c9f23616b52d958a/raw/f0480b8eaff590fb3f3ae2ab98ddce7dd3b4a237/downloaded-logs-20220327-171955.json].) > h6. Hypothesis > The {{initializeForKeyedRead}} method is not synchronized. Multiple threads can > enter the method, so they may initialize the index for the same shard and update > {{indexPerShard}} without synchronization. And the {{overKeyComponents}} > method also accesses {{indexPerShard}} without synchronization. As {{indexPerShard}} > is just a {{HashMap}}, which is not thread-safe, this can cause the > {{NullPointerException}} and {{IllegalStateException}} above. > h6. Suggestion > I think we can fix this issue by changing the type of {{indexPerShard}} to a > thread-safe map (e.g. {{ConcurrentHashMap}}). -- This message was sent by Atlassian Jira (v8.20.1#820001)
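The safe-publication argument in the comment above can be sketched as follows. The names are hypothetical stand-ins, not the actual IsmReaderImpl members: a synchronized initializer establishes the happens-before edge, after which concurrent reads of the unmodified {{HashMap}} are permitted by its Javadoc.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the safe-publication pattern described above.
public class SafePublicationSketch {
  private Map<String, Integer> indexPerShard; // built once, then read-only

  // Plays the role of the synchronized initializeBloomFilterAndIndexPerShard:
  // the monitor enter/exit creates a happens-before edge with every other
  // thread that later enters this method on the same object.
  public synchronized void initialize() {
    if (indexPerShard == null) {
      Map<String, Integer> index = new HashMap<>();
      index.put("shard-0", 42);
      indexPerShard = index; // safely published once fully built
    }
  }

  // Readers call initialize() first (as overKeyComponents invokes the
  // synchronized initializer), so they always see the fully built map;
  // read-only access to an unmodified HashMap needs no further locking.
  public Integer lookup(String shard) {
    initialize();
    return indexPerShard.get(shard);
  }
}
```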
[jira] [Commented] (BEAM-14144) Record JFR profiles when GC thrashing is detected
[ https://issues.apache.org/jira/browse/BEAM-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17519001#comment-17519001 ] Luke Cwik commented on BEAM-14144: -- Also of interest in that article was the jdk-jfr-compat library, which would allow us to compile against the API, but I couldn't find a released version of it. Maybe someone else has done the same thing already, or we could implement stub APIs to compile against, since the JDK will provide its implementation earlier on the classpath. > Record JFR profiles when GC thrashing is detected > - > > Key: BEAM-14144 > URL: https://issues.apache.org/jira/browse/BEAM-14144 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Steve Niemitz >Assignee: Steve Niemitz >Priority: P2 > Time Spent: 3h 20m > Remaining Estimate: 0h > > It'd be useful for debugging GC issues in jobs to have allocation profiles > from when they were occurring. In Java 9+, JFR is included in the JDK; we could use > that to record profiles when GC thrashing is detected. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (BEAM-14144) Record JFR profiles when GC thrashing is detected
[ https://issues.apache.org/jira/browse/BEAM-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518987#comment-17518987 ] Luke Cwik edited comment on BEAM-14144 at 4/7/22 4:07 PM: -- Note that OpenJDK 8u supports JFR. We already use the OpenJDK 8u as the base docker container for Apache Beam. https://developers.redhat.com/blog/2020/08/25/get-started-with-jdk-flight-recorder-in-openjdk-8u# was (Author: lcwik): Note that OpenJDK 8u supports JFR. We already use the OpenJDK 8u as the base docker container for Apache Beam. > Record JFR profiles when GC thrashing is detected > - > > Key: BEAM-14144 > URL: https://issues.apache.org/jira/browse/BEAM-14144 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Steve Niemitz >Assignee: Steve Niemitz >Priority: P2 > Time Spent: 3h 20m > Remaining Estimate: 0h > > It'd be useful for debugging GC issues in jobs to have allocation profiles > when it was occurring. In java 9+, JFR is included in the JDK, we could use > that to record profiles when GC thrashing is detected. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14144) Record JFR profiles when GC thrashing is detected
[ https://issues.apache.org/jira/browse/BEAM-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518987#comment-17518987 ] Luke Cwik commented on BEAM-14144: -- Note that OpenJDK 8u supports JFR. We already use the OpenJDK 8u as the base docker container for Apache Beam. > Record JFR profiles when GC thrashing is detected > - > > Key: BEAM-14144 > URL: https://issues.apache.org/jira/browse/BEAM-14144 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Steve Niemitz >Assignee: Steve Niemitz >Priority: P2 > Time Spent: 3h 20m > Remaining Estimate: 0h > > It'd be useful for debugging GC issues in jobs to have allocation profiles > when it was occurring. In java 9+, JFR is included in the JDK, we could use > that to record profiles when GC thrashing is detected. -- This message was sent by Atlassian Jira (v8.20.1#820001)
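As the comments above note, JFR ships with the JDK (Java 9+, and the linked article covers OpenJDK 8u). A hedged sketch of dumping a short on-demand recording using only the {{jdk.jfr.Recording}} API; the class name and the idea of calling it from a GC-thrashing detector are hypothetical, not Beam's actual implementation:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;

// Hypothetical profiler hook: record a timed JFR profile and write it to disk.
public class GcThrashingProfiler {
  public static Path recordProfile(Duration length) throws Exception {
    try (Recording recording = new Recording()) {
      recording.enable("jdk.ObjectAllocationInNewTLAB"); // allocation samples
      recording.enable("jdk.GarbageCollection");         // GC pause events
      recording.setDuration(length);                     // auto-stop after 'length'
      recording.start();
      Thread.sleep(length.toMillis());                   // wait out the recording
      Path out = Files.createTempFile("gc-thrashing", ".jfr");
      recording.dump(out);                               // write the .jfr file
      return out;
    }
  }
}
```

The resulting file can be opened in JDK Mission Control to inspect allocation hot spots around the thrashing window.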
[jira] [Updated] (BEAM-14175) "Read loop was aborted" errors are very noisy when large jobs fail
[ https://issues.apache.org/jira/browse/BEAM-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14175: - Fix Version/s: 2.39.0 Resolution: Fixed Status: Resolved (was: Open) > "Read loop was aborted" errors are very noisy when large jobs fail > -- > > Key: BEAM-14175 > URL: https://issues.apache.org/jira/browse/BEAM-14175 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Steve Niemitz >Assignee: Steve Niemitz >Priority: P2 > Fix For: 2.39.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When a large dataflow job fails or is cancelled, "Read loop was aborted." is > logged potentially 1000s of times as read operations are interrupted. These > could be instead logged at debug level rather than error to reduce noise in > the logs, they often hide actual useful logs. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2
[ https://issues.apache.org/jira/browse/BEAM-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518156#comment-17518156 ] Luke Cwik commented on BEAM-14253: -- There is an internal bug with dataflow service where it's 0 (aka unix epoch). It is about to be fixed and rolled out. > pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2 > > > Key: BEAM-14253 > URL: https://issues.apache.org/jira/browse/BEAM-14253 > Project: Beam > Issue Type: Sub-task > Components: io-java-gcp, test-failures >Reporter: Daniel Oliveira >Assignee: Daniel Collins >Priority: P1 > > Example: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/ > {noformat} > java.lang.AssertionError: Did not receive signal on > projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574 > in 300s > {noformat} > Dataflow logs show this, might be related: > {noformat} > Error message from worker: java.lang.IllegalArgumentException > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > > org.apache.beam.sdk.io.gcp.pubsublite.internal.SubscriptionPartitionLoader$GeneratorFn.getInitialWatermarkEstimatorState(SubscriptionPartitionLoader.java:76) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-13519) Java precommit flaky (timing out)
[ https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515652#comment-17515652 ] Luke Cwik commented on BEAM-13519: -- testCachingOfClient definitely was getting stuck, fix in https://github.com/apache/beam/pull/17240 with a few more fixes related to threading as well. > Java precommit flaky (timing out) > - > > Key: BEAM-13519 > URL: https://issues.apache.org/jira/browse/BEAM-13519 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kyle Weaver >Assignee: Kiley Sok >Priority: P1 > Labels: flake > Time Spent: 1h > Remaining Estimate: 0h > > Java precommits are sometimes timing out with no clear cause. Gradle will log > a bunch of routine build tasks, and then Jenkins will abort the job much > later. There are no logs to indicate what happened. It is not even clear > which task or tasks, if any, was the culprit, since many tasks are run in > parallel. > 01:53:28 > Task :sdks:java:testing:nexmark:build > 01:53:28 > Task :sdks:java:testing:nexmark:buildDependents > 01:53:28 > Task :sdks:java:extensions:sql:zetasql:buildDependents > 01:53:28 > Task :sdks:java:io:google-cloud-platform:buildDependents > 01:53:28 > Task :sdks:java:extensions:sql:buildDependents > 01:53:28 > Task :sdks:java:io:kafka:buildDependents > 01:53:28 > Task :sdks:java:extensions:join-library:buildDependents > 01:53:28 > Task :sdks:java:io:synthetic:buildDependents > 01:53:28 > Task :sdks:java:io:mongodb:buildDependents > 01:53:28 > Task :sdks:java:io:thrift:buildDependents > 01:53:28 > Task :sdks:java:testing:test-utils:buildDependents > 01:53:28 > Task :sdks:java:expansion-service:buildDependents > 01:53:28 > Task :sdks:java:extensions:arrow:buildDependents > 01:53:28 > Task :sdks:java:extensions:protobuf:buildDependents > 01:53:28 > Task :sdks:java:io:common:buildDependents > 01:53:28 > Task :runners:direct-java:buildDependents > 01:53:28 > Task :runners:local-java:buildDependents > 01:53:28 
Build timed out (after 120 minutes). Marking the build as aborted. > https://ci-beam.apache.org/job/beam_PreCommit_Java_cron/4874/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14177) GroupByKey iteration caching broken for portable runners like Dataflow runner v2
[ https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14177: - Resolution: Fixed Status: Resolved (was: Open) Yes, can close now that it is cherry picked. > GroupByKey iteration caching broken for portable runners like Dataflow runner > v2 > > > Key: BEAM-14177 > URL: https://issues.apache.org/jira/browse/BEAM-14177 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > The wrong cache key is being used as it has not been namespaced to the state > key. > This was previously being done within StateFetchingIterators but > https://github.com/apache/beam/pull/17121 changed that to use a single shared > key. > The fix is to subcache the cache before passing it into > StateFetchingIterators restoring the prior behavior. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513661#comment-17513661 ] Luke Cwik commented on BEAM-14192: -- b/224581228 was fixed internally, does the Dataflow container need to be rebuilt? > ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > -- > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The Dataflow ITs fails with a class version mismatch. I believe the Dataflow > v1 container that is being tested was built with the wrong JDK version. > Jenkins: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Reopened] (BEAM-12815) Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network buffers
[ https://issues.apache.org/jira/browse/BEAM-12815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-12815: -- It looks like it is still failing on TestXLang_Multi. Example failure: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/5207/ > Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network > buffers > -- > > Key: BEAM-12815 > URL: https://issues.apache.org/jira/browse/BEAM-12815 > Project: Beam > Issue Type: Bug > Components: cross-language, sdk-go >Reporter: Daniel Oliveira >Assignee: Danny McCormick >Priority: P3 > Fix For: Not applicable > > > When running the cross-language test suites () Flink fails on TestXLang_Multi > with the following error: > {noformat} > 19:29:14 2021/08/27 02:29:14 (): java.io.IOException: Insufficient number of > network buffers: required 17, but only 16 available. The total number of > network buffers is currently set to 2048 of 32768 bytes each. You can > increase this number by setting the configuration keys > 'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and > 'taskmanager.memory.network.max'. > 19:29:14 2021/08/27 02:29:14 Job state: FAILED > 19:29:14 --- FAIL: TestXLang_Multi (6.26s){noformat} > This doesn't seem to be a parallelism problem (go test is run with "-p 1" as > expected) and is only happening on this specific test. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14192: - Summary: ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime (was: IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime) > ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > -- > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The Dataflow ITs fails with a class version mismatch. I believe the Dataflow > v1 container that is being tested was built with the wrong JDK version. > Jenkins: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by 
Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14192: - Component/s: runner-dataflow > ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > -- > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The Dataflow ITs fails with a class version mismatch. I believe the Dataflow > v1 container that is being tested was built with the wrong JDK version. > Jenkins: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14192) IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14192: - Description: The Dataflow ITs fails with a class version mismatch. I believe the Dataflow v1 container that is being tested was built with the wrong JDK version. Jenkins: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink Example Failure: {noformat} java.lang.RuntimeException: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) {noformat} was: The IT fails with a class version mismatch. This is odd since this implies that we are somehow getting a Apache Commons Logging jar that was compiled with Java 11. Note that I have only seen this in this IT and not others which is pointing at zetasketch classpath being broken in some way. My first thought was that we were pulling in a dependency with an incorrectly shaded apache-commons logging class but when I dumped the classpath and all the jar contents on the commons-logging-1.2.jar contained this class. 
First found failure: https://ci-beam.apache.org/job/beam_PostCommit_Java/8786/ Example Failure: {noformat} java.lang.RuntimeException: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) {noformat} > IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > - > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The Dataflow ITs fails with a class version mismatch. I believe the Dataflow > v1 container that is being tested was built with the wrong JDK version. 
> Jenkins: > https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (BEAM-14192) BigQueryHllSketchCompatibilityIT fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-14192: Assignee: Kiley Sok (was: Andrew Pilloud) > BigQueryHllSketchCompatibilityIT fails with > org/apache/commons/logging/LogFactory has been compiled by a more recent > version of the Java Runtime > > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The IT fails with a class version mismatch. This is odd since this implies > that we are somehow getting a Apache Commons Logging jar that was compiled > with Java 11. > Note that I have only seen this in this IT and not others which is pointing > at zetasketch classpath being broken in some way. > My first thought was that we were pulling in a dependency with an incorrectly > shaded apache-commons logging class but when I dumped the classpath and all > the jar contents on the commons-logging-1.2.jar contained this class. 
> First found failure: https://ci-beam.apache.org/job/beam_PostCommit_Java/8786/ > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14192) IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
[ https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14192: - Summary: IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime (was: BigQueryHllSketchCompatibilityIT fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime) > IT run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has > been compiled by a more recent version of the Java Runtime > - > > Key: BEAM-14192 > URL: https://issues.apache.org/jira/browse/BEAM-14192 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Luke Cwik >Assignee: Kiley Sok >Priority: P2 > > The IT fails with a class version mismatch. This is odd since this implies > that we are somehow getting an Apache Commons Logging jar that was compiled > with Java 11. > Note that I have only seen this in this IT and not in others, which points > at the zetasketch classpath being broken in some way. > My first thought was that we were pulling in a dependency with an incorrectly > shaded apache-commons logging class, but when I dumped the classpath and all > the jar contents, only the commons-logging-1.2.jar contained this class. 
> First found failure: https://ci-beam.apache.org/job/beam_PostCommit_Java/8786/ > Example Failure: > {noformat} > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.beam.sdk.util.UserCodeException: > java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory > has been compiled by a more recent version of the Java Runtime (class file > version 55.0), this version of the Java Runtime only recognizes class file > versions up to 52.0 > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) > at > org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
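[Editor's note] The UnsupportedClassVersionError above is purely about class-file major versions: a class-file major version maps to Java release "major - 44", so major 55 is Java-11-compiled bytecode and major 52 is the newest a Java 8 runtime will accept. A tiny illustrative helper (not Beam code) capturing that mapping:

```java
// Illustrative only: class-file major version N corresponds to Java release
// N - 44. The failure above means Java-11-compiled bytecode (major 55) was
// loaded by a Java 8 runtime, which accepts at most major 52.
public class ClassFileVersions {
    // Java release that produces a given class-file major version.
    static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        System.out.println("major 55 -> Java " + javaRelease(55)); // Java 11
        System.out.println("major 52 -> Java " + javaRelease(52)); // Java 8
    }
}
```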
[jira] [Created] (BEAM-14193) SpannerChangeStreamOrderedWithinKeyGloballyIT and SpannerChangeStreamOrderedWithinKeyIT fail on Direct Runner
Luke Cwik created BEAM-14193: Summary: SpannerChangeStreamOrderedWithinKeyGloballyIT and SpannerChangeStreamOrderedWithinKeyIT fail on Direct Runner Key: BEAM-14193 URL: https://issues.apache.org/jira/browse/BEAM-14193 Project: Beam Issue Type: Bug Components: runner-direct, test-failures Reporter: Luke Cwik Assignee: Pablo Estrada Like BEAM-14151, it seems as though DirectRunner fails on these tests pretty consistently: {noformat} org.apache.beam.sdk.io.gcp.spanner.changestreams.it.SpannerChangeStreamOrderedWithinKeyGloballyIT.testOrderedWithinKey org.apache.beam.sdk.io.gcp.spanner.changestreams.it.SpannerChangeStreamOrderedWithinKeyIT.testOrderedWithinKey {noformat} This has been failing for a few days; occasionally it passes, but that could be a fluke where the direct runner happened to execute in the expected order. Example failure: https://ci-beam.apache.org/job/beam_PostCommit_Java/8776/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14192) BigQueryHllSketchCompatibilityIT fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime
Luke Cwik created BEAM-14192: Summary: BigQueryHllSketchCompatibilityIT fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime Key: BEAM-14192 URL: https://issues.apache.org/jira/browse/BEAM-14192 Project: Beam Issue Type: Bug Components: test-failures Reporter: Luke Cwik Assignee: Andrew Pilloud The IT fails with a class version mismatch. This is odd since this implies that we are somehow getting an Apache Commons Logging jar that was compiled with Java 11. Note that I have only seen this in this IT and not in others, which points at the zetasketch classpath being broken in some way. My first thought was that we were pulling in a dependency with an incorrectly shaded apache-commons logging class, but when I dumped the classpath and all the jar contents, only the commons-logging-1.2.jar contained this class. First found failure: https://ci-beam.apache.org/job/beam_PostCommit_Java/8786/ Example Failure: {noformat} java.lang.RuntimeException: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187) at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56) at org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14191) CrossLanguageJdbcIOTest broken with "Cannot load JDBC driver class 'com.mysql.cj.jdbc.Driver'"
Luke Cwik created BEAM-14191: Summary: CrossLanguageJdbcIOTest broken with "Cannot load JDBC driver class 'com.mysql.cj.jdbc.Driver'" Key: BEAM-14191 URL: https://issues.apache.org/jira/browse/BEAM-14191 Project: Beam Issue Type: Bug Components: cross-language, test-failures Reporter: Luke Cwik Assignee: Chamikara Madhusanka Jayalath Fix For: 2.38.0 Example failure: https://ci-beam.apache.org/job/beam_PostCommit_Python36/5113/testReport/junit/apache_beam.io.external.xlang_jdbcio_it_test/CrossLanguageJdbcIOTest/test_xlang_jdbc_read_1_mysql/ Culprit: https://github.com/apache/beam/pull/17167 {noformat} RuntimeError: Pipeline BeamApp-jenkins-0325181709-516ba346_d7972792-2c60-4332-8107-ac73c585491a failed in state FAILED: java.lang.RuntimeException: Error received from SDK harness for instruction 19: org.apache.beam.sdk.util.UserCodeException: java.sql.SQLException: Cannot load JDBC driver class 'com.mysql.cj.jdbc.Driver' {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
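[Editor's note] The failure above is a plain classpath miss: the MySQL connector jar was not available where JdbcIO's expansion ran. A minimal, hypothetical probe (not the actual JdbcIO code) that isolates the "Cannot load JDBC driver class" failure mode:

```java
// Sketch only: check whether a JDBC driver class is loadable. JdbcIO itself
// does much more (DataSource pooling, etc.); this just reproduces the
// ClassNotFound condition behind the error message above.
public class DriverCheck {
    static boolean driverAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 'com.mysql.cj.jdbc.Driver' is the class named in the failure; it is
        // only present when the MySQL connector jar is on the classpath.
        System.out.println(driverAvailable("com.mysql.cj.jdbc.Driver"));
    }
}
```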
[jira] [Created] (BEAM-14184) DirectStreamObserver does not respect channel isReady
Luke Cwik created BEAM-14184: Summary: DirectStreamObserver does not respect channel isReady Key: BEAM-14184 URL: https://issues.apache.org/jira/browse/BEAM-14184 Project: Beam Issue Type: Bug Components: sdk-java-harness Reporter: Luke Cwik Assignee: Luke Cwik Leads to OOMs like: {noformat} Output channel stalled for 1023s, outbound thread CHAIN MapPartition (MapPartition at [1]PerformInference) -> FlatMap (FlatMap at ExtractOutput[0]) -> Map (Key Extractor) -> GroupCombine (GroupCombine at GroupCombine: PerformInferenceAndCombineResults_dep_049/GroupPredictionsByImage) -> Map (Key Extractor) (1/1). See: https://issues.apache.org/jira/browse/BEAM-4280 for the history for this issue. Feb 18, 2022 11:51:05 AM org.apache.beam.vendor.grpc.v1p36p0.io.grpc.netty.NettyServerTransport notifyTerminated INFO: Transport failed org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 2097152 byte(s) of direct memory (used: 1205862679, max: 1207959552) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:204) at org.apache.beam.vendor.grpc.v1p36p0.io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:188) {noformat} See more context in https://lists.apache.org/thread/llmxodbmczhn10c98prs8wmd5hy4nvff -- This message was sent by Atlassian Jira (v8.20.1#820001)
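[Editor's note] The flow-control idea behind BEAM-14184 can be sketched generically: only call onNext once the channel reports it is ready, instead of queueing unbounded output into Netty's direct buffers. The Observer interface below is illustrative, not the real gRPC StreamObserver API, and real code would block on a readiness notification rather than spinning:

```java
// Sketch of isReady-based backpressure. Illustrative names only; this is
// not the actual gRPC or Beam DirectStreamObserver API.
import java.util.ArrayList;
import java.util.List;

public class FlowControlSketch {
    interface Observer<T> {
        boolean isReady();
        void onNext(T value);
    }

    // Wait (crudely, by polling) until the observer is ready, then send.
    static <T> void sendWithBackpressure(Observer<T> out, T value) {
        while (!out.isReady()) {
            Thread.onSpinWait(); // placeholder for a real readiness callback
        }
        out.onNext(value);
    }

    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        Observer<Integer> fake = new Observer<Integer>() {
            int polls = 0;
            public boolean isReady() { return ++polls > 3; } // busy for 3 polls
            public void onNext(Integer v) { received.add(v); }
        };
        sendWithBackpressure(fake, 42);
        System.out.println(received); // [42]
    }
}
```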
[jira] [Commented] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues
[ https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512570#comment-17512570 ] Luke Cwik commented on BEAM-14179: -- The required_labels field (https://github.com/apache/beam/blob/d62b5d6c43847a3b5ecb1321144cf10c6931bd98/model/pipeline/src/main/proto/metrics.proto#L311) tells us which labels are necessary. This reduces the problem to deciding what should be done when the value is unknown. > MonitoringInfoMetricName null value guard uncovering additional issues > -- > > Key: BEAM-14179 > URL: https://issues.apache.org/jira/browse/BEAM-14179 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, sdk-java-harness >Reporter: Luke Cwik >Assignee: Daniel Oliveira >Priority: P2 > Fix For: 2.38.0 > > > Additional integration testing > (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that > https://github.com/apache/beam/pull/17094 causes a regression: > The test failed with: > {noformat} > Caused by: java.lang.NullPointerException: null value in entry: > DATASTORE_NAMESPACE=null > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.(MonitoringInfoMetricName.java:46) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93) > at > 
org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues
[ https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512561#comment-17512561 ] Luke Cwik commented on BEAM-14179: -- The question is what the intended outcome should be in this case for these service call metrics: 1) They are required, in which case failing in the code is the right behavior. 2) They are optional, and we should pass in some placeholder value like "null", "", or "N/A". 3) The key can be filtered from the map if the value is null. > MonitoringInfoMetricName null value guard uncovering additional issues > -- > > Key: BEAM-14179 > URL: https://issues.apache.org/jira/browse/BEAM-14179 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, sdk-java-harness >Reporter: Luke Cwik >Assignee: Daniel Oliveira >Priority: P2 > Fix For: 2.38.0 > > > Additional integration testing > (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that > https://github.com/apache/beam/pull/17094 causes a regression: > The test failed with: > {noformat} > Caused by: java.lang.NullPointerException: null value in entry: > DATASTORE_NAMESPACE=null > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.(MonitoringInfoMetricName.java:46) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93) > at > 
org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
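[Editor's note] The NPE above comes from copying a labels map that contains a null value; JDK java.util.Map.copyOf enforces the same no-null contract as the vendored Guava ImmutableMap.copyOf. The "filter null-valued keys" option discussed in the comments can be sketched like this (illustrative names, not Beam's actual metrics code):

```java
// Sketch: drop null-valued entries before building an immutable labels map,
// mirroring option 3 from the discussion. Map.copyOf rejects null values
// just like the vendored ImmutableMap.copyOf in the stack trace above.
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NullValueLabels {
    // Immutable copy of the labels with null-valued entries removed.
    static Map<String, String> withoutNullValues(Map<String, String> labels) {
        Map<String, String> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : labels.entrySet()) {
            if (e.getValue() != null) {
                filtered.put(e.getKey(), e.getValue());
            }
        }
        return Map.copyOf(filtered);
    }

    public static void main(String[] args) {
        Map<String, String> labels = new HashMap<>();
        labels.put("PROJECT_ID", "my-project");  // illustrative label
        labels.put("DATASTORE_NAMESPACE", null); // the entry that blew up
        boolean rejected = false;
        try {
            Map.copyOf(labels); // throws NullPointerException, like ImmutableMap.copyOf
        } catch (NullPointerException expected) {
            rejected = true;
        }
        System.out.println(rejected);                  // true
        System.out.println(withoutNullValues(labels)); // {PROJECT_ID=my-project}
    }
}
```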
[jira] [Updated] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues
[ https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14179: - Status: Open (was: Triage Needed) > MonitoringInfoMetricName null value guard uncovering additional issues > -- > > Key: BEAM-14179 > URL: https://issues.apache.org/jira/browse/BEAM-14179 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, sdk-java-harness >Reporter: Luke Cwik >Assignee: Daniel Oliveira >Priority: P2 > Fix For: 2.38.0 > > > Additional integration testing > (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that > https://github.com/apache/beam/pull/17094 causes a regression: > The test failed with: > {noformat} > Caused by: java.lang.NullPointerException: null value in entry: > DATASTORE_NAMESPACE=null > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.(MonitoringInfoMetricName.java:46) > at > org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93) > at > org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927) > at > org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965) > {noformat} -- This 
message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues
Luke Cwik created BEAM-14179: Summary: MonitoringInfoMetricName null value guard uncovering additional issues Key: BEAM-14179 URL: https://issues.apache.org/jira/browse/BEAM-14179 Project: Beam Issue Type: Bug Components: io-java-gcp, sdk-java-harness Reporter: Luke Cwik Assignee: Daniel Oliveira Fix For: 2.38.0 Additional integration testing (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that https://github.com/apache/beam/pull/17094 causes a regression: The test failed with: {noformat} Caused by: java.lang.NullPointerException: null value in entry: DATASTORE_NAMESPACE=null at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437) at org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.(MonitoringInfoMetricName.java:46) at org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93) at org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82) at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927) at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14177) GroupByKey iteration caching broken for portable runners like Dataflow runner v2
[ https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14177: - Description: The wrong cache key is being used as it has not been namespaced to the state key. This was previously being done within StateFetchingIterators but https://github.com/apache/beam/pull/17121 changed that to use a single shared key. The fix is to subcache the cache before passing it into StateFetchingIterators restoring the prior behavior. was: The wrong cache key is being used as it has not been namespaced to the state key. This was previously being done within StateFetchingIterators but https://github.com/apache/beam/pull/17121 changed that to use a single shared key. > GroupByKey iteration caching broken for portable runners like Dataflow runner > v2 > > > Key: BEAM-14177 > URL: https://issues.apache.org/jira/browse/BEAM-14177 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The wrong cache key is being used as it has not been namespaced to the state > key. > This was previously being done within StateFetchingIterators but > https://github.com/apache/beam/pull/17121 changed that to use a single shared > key. > The fix is to subcache the cache before passing it into > StateFetchingIterators restoring the prior behavior. -- This message was sent by Atlassian Jira (v8.20.1#820001)
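[Editor's note] The fix described above ("subcache the cache before passing it into StateFetchingIterators") can be sketched generically: wrap the single shared cache so every key is namespaced by the state key, restoring per-state isolation. Class and method names below are illustrative, not Beam's actual Cache API:

```java
// Sketch of key-namespacing ("subcaching") a shared cache. Two consumers
// given different subcaches can use the same inner key without colliding,
// which is the property the BEAM-14177 fix restores.
import java.util.HashMap;
import java.util.Map;

public class SubCacheSketch {
    static class Cache {
        private final Map<String, Object> entries;
        private final String prefix;

        Cache() { this(new HashMap<>(), ""); }
        private Cache(Map<String, Object> entries, String prefix) {
            this.entries = entries;
            this.prefix = prefix;
        }

        // A view over the same underlying storage, with keys namespaced by 'name'.
        Cache subCache(String name) {
            return new Cache(entries, prefix + name + "/");
        }

        void put(String key, Object value) { entries.put(prefix + key, value); }
        Object get(String key) { return entries.get(prefix + key); }
    }

    public static void main(String[] args) {
        Cache shared = new Cache();
        Cache stateA = shared.subCache("stateKeyA");
        Cache stateB = shared.subCache("stateKeyB");
        stateA.put("page0", "valuesForA");
        stateB.put("page0", "valuesForB"); // same inner key, no collision
        System.out.println(stateA.get("page0")); // valuesForA
        System.out.println(stateB.get("page0")); // valuesForB
    }
}
```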
[jira] [Updated] (BEAM-14177) GroupByKey iteration caching broken for portable runners like Dataflow runner v2
[ https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14177: - Summary: GroupByKey iteration caching broken for portable runners like Dataflow runner v2 (was: GroupByKey re-iteration caching broken for portable runners like Dataflow runner v2) > GroupByKey iteration caching broken for portable runners like Dataflow runner > v2 > > > Key: BEAM-14177 > URL: https://issues.apache.org/jira/browse/BEAM-14177 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The wrong cache key is being used as it has not been namespaced to the state > key. > This was previously being done within StateFetchingIterators but > https://github.com/apache/beam/pull/17121 changed that to use a single shared > key. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14177) GroupByKey re-iteration caching broken for portable runners like Dataflow runner v2
[ https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14177: - Summary: GroupByKey re-iteration caching broken for portable runners like Dataflow runner v2 (was: GroupByKey re-iteration broken for portable runners like Dataflow runner v2) > GroupByKey re-iteration caching broken for portable runners like Dataflow > runner v2 > --- > > Key: BEAM-14177 > URL: https://issues.apache.org/jira/browse/BEAM-14177 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The wrong cache key is being used as it has not been namespaced to the state > key. > This was previously being done within StateFetchingIterators but > https://github.com/apache/beam/pull/17121 changed that to use a single shared > key. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14177) GroupByKey re-iteration broken for portable runners like Dataflow runner v2
[ https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14177: - Summary: GroupByKey re-iteration broken for portable runners like Dataflow runner v2 (was: GroupByKey re-iteration broken for Dataflow runner v2) > GroupByKey re-iteration broken for portable runners like Dataflow runner v2 > --- > > Key: BEAM-14177 > URL: https://issues.apache.org/jira/browse/BEAM-14177 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The wrong cache key is being used as it has not been namespaced to the state > key. > This was previously being done within StateFetchingIterators but > https://github.com/apache/beam/pull/17121 changed that to use a single shared > key. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14177) GroupByKey re-iteration broken for Dataflow runner v2
Luke Cwik created BEAM-14177: Summary: GroupByKey re-iteration broken for Dataflow runner v2 Key: BEAM-14177 URL: https://issues.apache.org/jira/browse/BEAM-14177 Project: Beam Issue Type: Bug Components: sdk-java-harness Reporter: Luke Cwik Assignee: Luke Cwik Fix For: 2.38.0 The wrong cache key is being used as it has not been namespaced to the state key. This was previously being done within StateFetchingIterators but https://github.com/apache/beam/pull/17121 changed that to use a single shared key. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14116) Fix Pub/Sub Lite IO and SDF performance issues with shuffles
[ https://issues.apache.org/jira/browse/BEAM-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512074#comment-17512074 ] Luke Cwik commented on BEAM-14116: -- https://github.com/apache/beam/pull/17004 seems to have broken the internal cloud/dataflow/testing/integration/sdk:streaming_GroupByKeyTest_BasicTests_testLargeKeys1MB_wm_service_local test reliably for Dataflow. Please work with the release manager [~danoliveira] to rollback or forward fix. > Fix Pub/Sub Lite IO and SDF performance issues with shuffles > > > Key: BEAM-14116 > URL: https://issues.apache.org/jira/browse/BEAM-14116 > Project: Beam > Issue Type: Task > Components: io-java-gcp, runner-dataflow >Reporter: Daniel Collins >Assignee: Daniel Collins >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-3305) Consider: Go ViewFn support
[ https://issues.apache.org/jira/browse/BEAM-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-3305: Description: We can add it as an optional field to beam.SideInput (see: pkg/beam/option.go). The execution side needs to support it as well. However, with the various side input forms, it's not clear how valuable such a feature would be. It is strongly recommended to add Map Side Inputs https://issues.apache.org/jira/browse/BEAM-3293 before implementing this suggestion, and required to have caching implemented https://issues.apache.org/jira/browse/BEAM-11097. Otherwise very little benefit is achieved. See https://issues.apache.org/jira/browse/BEAM-3293 for where code might need to be changed. Make providing a ViewFn part of the beam.SideInput struct, either as a method to validate & update unexported fields, or adding an optional field. The ViewFn would be a function that takes in the iterator/multimap version of the Side Input and returns whatever the pre-processed type is. That value would be what is cached to reduce work per common window. The DoFn should then be able to use that type in the position of the ProcessElement method for that side input. This avoids issues that only work in the Global Window, like trying to process the side input in StartBundle. Something would need to be changed about handling the DoFn method signature analysis to support additional alternative side input types. was: We can add it as an optional field to beam.SideInput (see: pkg/beam/option.go). The execution side needs to support it as well. However, with the various side input forms, it's not clear how valuable such a feature would be. It is strongly recommended to add Map Side Inputs https://issues.apache.org/jira/browse/BEAM-3293 before implementing this suggestion, and required to have caching implemented https://issues.apache.org/jira/browse/BEAM-11097. Otherwise very little benefit is achieved. 
See https://issues.apache.org/jira/browse/BEAM-3293 for where code might need to be changed. Make providing a ViewFn part of the beam.SideInput struct, either as a method to validate & update unexported fields, or adding an optional field. The ViewFn would be a function that takes in the iterator version of the Side Input and returns whatever the pre-processed type is. That value would be what is cached to reduce work per common window. The DoFn should then be able to use that type in the position of the ProcessElement method for that side input. This avoids issues that only work in the Global Window, like trying to process the side input in StartBundle. Something would need to be changed about handling the DoFn method signature analysis to support additional alternative side input types. > Consider: Go ViewFn support > --- > > Key: BEAM-3305 > URL: https://issues.apache.org/jira/browse/BEAM-3305 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Priority: P3 > > We can add it as an optional field to beam.SideInput (see: > pkg/beam/option.go). The execution side needs to support it as well. However, > with the various side input forms, it's not clear how valuable such a feature > would be. > It is strongly recommended to add Map Side Inputs > https://issues.apache.org/jira/browse/BEAM-3293 before implementing this > suggestion, and required to have caching implemented > https://issues.apache.org/jira/browse/BEAM-11097. Otherwise very little > benefit is achieved. > See https://issues.apache.org/jira/browse/BEAM-3293 for where code might need > to be changed. > Make providing a ViewFn part of the beam.SideInput struct, either as a method > to validate & update unexported fields, or adding an optional field. > The ViewFn would be a function that takes in the iterator/multimap version of > the Side Input and returns whatever the pre-processed type is. That value > would be what is cached to reduce work per common window. 
> The DoFn should then be able to use that type in the position of the > ProcessElement method for that side input. This avoids issues that only work > in the Global Window, like trying to process the side input in StartBundle. > Something would need to be changed about handling the DoFn method signature > analysis to support additional alternative side input types. -- This message was sent by Atlassian Jira (v8.20.1#820001)
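[Editor's note] The ViewFn proposal above can be sketched language-neutrally (here in Java rather than Go): the user's pre-processing function runs once per window over the raw side-input values, and the result is cached so later lookups in the same window reuse it. All names below are illustrative:

```java
// Sketch of per-window ViewFn caching: the expensive viewFn runs only on a
// cache miss for a window; repeated lookups in the same window reuse the
// pre-processed value, which is the benefit the issue describes.
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ViewFnCacheSketch {
    static int materializations = 0; // counts how often the ViewFn actually ran

    // Return the cached view for 'window', computing it with viewFn on a miss.
    static <I, O> O viewFor(Map<String, O> cache, String window, I raw, Function<I, O> viewFn) {
        return cache.computeIfAbsent(window, w -> {
            materializations++;
            return viewFn.apply(raw);
        });
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<>();
        Function<List<Integer>, Integer> sumView =
            xs -> xs.stream().mapToInt(Integer::intValue).sum();

        List<Integer> rawSideInput = List.of(1, 2, 3);
        viewFor(cache, "window-1", rawSideInput, sumView); // computes: 6
        viewFor(cache, "window-1", rawSideInput, sumView); // cache hit
        viewFor(cache, "window-2", rawSideInput, sumView); // computes again
        System.out.println(materializations); // 2
    }
}
```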
[jira] [Commented] (BEAM-11325) KafkaIO should be able to read from new added topic/partition automatically during pipeline execution time
[ https://issues.apache.org/jira/browse/BEAM-11325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511311#comment-17511311 ] Luke Cwik commented on BEAM-11325: -- We should use BEAM-13852 to track the fix. > KafkaIO should be able to read from new added topic/partition automatically > during pipeline execution time > -- > > Key: BEAM-11325 > URL: https://issues.apache.org/jira/browse/BEAM-11325 > Project: Beam > Issue Type: New Feature > Components: io-java-kafka >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: P2 > Fix For: 2.29.0 > > Time Spent: 7.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14134) Many coders cause significant unnecessary allocations
[ https://issues.apache.org/jira/browse/BEAM-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14134: - Fix Version/s: 2.38.0 Resolution: Fixed Status: Resolved (was: Open) > Many coders cause significant unnecessary allocations > - > > Key: BEAM-14134 > URL: https://issues.apache.org/jira/browse/BEAM-14134 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Steve Niemitz >Assignee: Steve Niemitz >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 5h 20m > Remaining Estimate: 0h > > Many coders (BigEndian*, Map, Iterable, Instant) use DataInputStream to read > longs/ints/shorts. Internally each DataInputStream allocates ~200 bytes of > buffers when instantiated. This means every long, int, short, etc decoded > allocates over 200 bytes. > We should eliminate all uses of DataInputStream in hot-paths and replace it > with something more efficient. -- This message was sent by Atlassian Jira (v8.20.1#820001)
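[Editor's note] The allocation issue above can be illustrated with a small sketch (not Beam's actual coder code): decoding a big-endian long directly from a byte array does the same work as new DataInputStream(...).readLong() without constructing a stream, and its internal buffers, per decoded value:

```java
// Sketch: allocation-free big-endian decode of a long from a byte array,
// the kind of replacement BEAM-14134 made for hot-path DataInputStream use.
import java.nio.ByteBuffer;

public class BigEndianDecode {
    // Read 8 bytes starting at 'offset' as a big-endian long, no allocation.
    static long decodeLong(byte[] bytes, int offset) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result = (result << 8) | (bytes[offset + i] & 0xFFL);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = ByteBuffer.allocate(8).putLong(123456789L).array();
        System.out.println(decodeLong(encoded, 0)); // 123456789
    }
}
```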
[jira] [Assigned] (BEAM-13695) Provide more accurate size estimates for cache objects in Java 17
[ https://issues.apache.org/jira/browse/BEAM-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-13695: Assignee: Kiley Sok > Provide more accurate size estimates for cache objects in Java 17 > - > > Key: BEAM-13695 > URL: https://issues.apache.org/jira/browse/BEAM-13695 > Project: Beam > Issue Type: Improvement > Components: sdk-java-harness >Reporter: Kiley Sok >Assignee: Kiley Sok >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14154) Collect And Deploy Playground Examples workflow consistently fails
[ https://issues.apache.org/jira/browse/BEAM-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510924#comment-17510924 ] Luke Cwik commented on BEAM-14154: -- I manually disabled the workflow from the github UI. !disabled.png! > Collect And Deploy Playground Examples workflow consistently fails > -- > > Key: BEAM-14154 > URL: https://issues.apache.org/jira/browse/BEAM-14154 > Project: Beam > Issue Type: Bug > Components: beam-playground, test-failures >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: P2 > Attachments: disabled.png > > > https://github.com/apache/beam/actions/workflows/playground_deploy_examples.yml?query=is%3Afailure > is failing consistently. Has failed 344 times and have never succeeded in > the past 3 months. > Fails with: > {noformat} > Run helm del --namespace $K8S_NAMESPACE $HELM_APP_NAME > Error: Kubernetes cluster unreachable: Get "http://localhost:8080/version": > dial tcp [::1]:8080: connect: connection refused > Error: Process completed with exit code 1. > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14154) Collect And Deploy Playground Examples workflow consistently fails
[ https://issues.apache.org/jira/browse/BEAM-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14154: - Attachment: disabled.png > Collect And Deploy Playground Examples workflow consistently fails > -- > > Key: BEAM-14154 > URL: https://issues.apache.org/jira/browse/BEAM-14154 > Project: Beam > Issue Type: Bug > Components: beam-playground, test-failures >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: P2 > Attachments: disabled.png > > > https://github.com/apache/beam/actions/workflows/playground_deploy_examples.yml?query=is%3Afailure > is failing consistently. Has failed 344 times and have never succeeded in > the past 3 months. > Fails with: > {noformat} > Run helm del --namespace $K8S_NAMESPACE $HELM_APP_NAME > Error: Kubernetes cluster unreachable: Get "http://localhost:8080/version": > dial tcp [::1]:8080: connect: connection refused > Error: Process completed with exit code 1. > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14154) Collect And Deploy Playground Examples workflow consistently fails
Luke Cwik created BEAM-14154: Summary: Collect And Deploy Playground Examples workflow consistently fails Key: BEAM-14154 URL: https://issues.apache.org/jira/browse/BEAM-14154 Project: Beam Issue Type: Bug Components: beam-playground, test-failures Reporter: Luke Cwik Assignee: Pablo Estrada https://github.com/apache/beam/actions/workflows/playground_deploy_examples.yml?query=is%3Afailure is failing consistently. Has failed 344 times and have never succeeded in the past 3 months. Fails with: {noformat} Run helm del --namespace $K8S_NAMESPACE $HELM_APP_NAME Error: Kubernetes cluster unreachable: Get "http://localhost:8080/version": dial tcp [::1]:8080: connect: connection refused Error: Process completed with exit code 1. {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
(Note: per the linked workflow history, the run had failed 344 times with no success in the prior 3 months.)
[jira] [Updated] (BEAM-14154) Collect And Deploy Playground Examples workflow consistently fails
[ https://issues.apache.org/jira/browse/BEAM-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14154: - Status: Open (was: Triage Needed) > Collect And Deploy Playground Examples workflow consistently fails > -- > > Key: BEAM-14154 > URL: https://issues.apache.org/jira/browse/BEAM-14154 > Project: Beam > Issue Type: Bug > Components: beam-playground, test-failures >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: P2 > > https://github.com/apache/beam/actions/workflows/playground_deploy_examples.yml?query=is%3Afailure > is failing consistently. Has failed 344 times and have never succeeded in > the past 3 months. > Fails with: > {noformat} > Run helm del --namespace $K8S_NAMESPACE $HELM_APP_NAME > Error: Kubernetes cluster unreachable: Get "http://localhost:8080/version": > dial tcp [::1]:8080: connect: connection refused > Error: Process completed with exit code 1. > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-14152) SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky
[ https://issues.apache.org/jira/browse/BEAM-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14152: - Status: Open (was: Triage Needed) > SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky > -- > > Key: BEAM-14152 > URL: https://issues.apache.org/jira/browse/BEAM-14152 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, test-failures >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: P2 > Time Spent: 0.5h > Remaining Estimate: 0h > > Example failures: > * > https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21726/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ > * > https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21723/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ > {noformat} > java.lang.AssertionError: > Expected: a value greater than <1> > but: <0> was less than <1> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.beam.sdk.io.gcp.spanner.changestreams.SpannerChangeStreamErrorTest.testUnavailableExceptionRetries(SpannerChangeStreamErrorTest.java:166) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:323) > at > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14132) Java 17 slow for nexmark query 3
[ https://issues.apache.org/jira/browse/BEAM-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510909#comment-17510909 ] Luke Cwik commented on BEAM-14132: -- Query 3 uses a stateful DoFn. A large difference between Java 17 and prior versions is that Jamm can't measure the object sizes that are placed in the cache so the object sizing isn't accurate and also incurs a cost where it throws an exception and a backup method is used: https://github.com/apache/beam/blob/c2d4e5162afdc5ad9d32da7eca681581322e515f/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/Caches.java#L70 Should be addressed for benchmarks with https://github.com/apache/beam/pull/17110 but this solution requires users to still add modules for jamm where we could have enumerated them. This might eventually be part of the JVM as per https://openjdk.java.net/jeps/8249196 > Java 17 slow for nexmark query 3 > > > Key: BEAM-14132 > URL: https://issues.apache.org/jira/browse/BEAM-14132 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kiley Sok >Priority: P2 > Attachments: Screen Shot 2022-03-18 at 5.09.00 PM.png > > > http://metrics.beam.apache.org/d/8INnSY9Mz/nexmark-dataflow-runner-v2?orgId=1&var-processingType=streaming&var-ID=All&from=now-30d&to=now > For streaming runner v2, Java 17 is twice as slow as Java8/11 on query 3 > https://github.com/apache/beam/blob/master/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/Query3.java -- This message was sent by Atlassian Jira (v8.20.1#820001)
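The cost pattern described in the comment above, where deep measurement throws on Java 17 and a backup estimate is used, can be sketched generically. This is a hypothetical illustration, not the actual Caches.java API:

```java
import java.util.function.ToLongFunction;

// Hypothetical sketch of the measure-then-fall-back pattern described above;
// this is not Beam's actual Caches.java API. On Java 17, a deep measurer
// like Jamm may throw when reflective access to JDK internals is denied.
public class WeighWithFallback {
  public static long weigh(Object o, ToLongFunction<Object> deepMeasurer, long fallbackBytes) {
    try {
      return deepMeasurer.applyAsLong(o);
    } catch (RuntimeException e) {
      // The throw/catch itself has a cost on every cache insertion, which is
      // part of the slowdown: the estimate is both inaccurate and expensive.
      return fallbackBytes;
    }
  }
}
```

Adding the `--add-opens`/module flags (as in the linked PR) restores the fast, accurate path so the exception branch is never taken.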
[jira] [Commented] (BEAM-14152) SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky
[ https://issues.apache.org/jira/browse/BEAM-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510907#comment-17510907 ] Luke Cwik commented on BEAM-14152: -- Test disabled in https://github.com/apache/beam/pull/17158 > SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky > -- > > Key: BEAM-14152 > URL: https://issues.apache.org/jira/browse/BEAM-14152 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, test-failures >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: P2 > Time Spent: 20m > Remaining Estimate: 0h > > Example failures: > * > https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21726/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ > * > https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21723/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ > {noformat} > java.lang.AssertionError: > Expected: a value greater than <1> > but: <0> was less than <1> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.beam.sdk.io.gcp.spanner.changestreams.SpannerChangeStreamErrorTest.testUnavailableExceptionRetries(SpannerChangeStreamErrorTest.java:166) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:323) > at > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14152) SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky
Luke Cwik created BEAM-14152: Summary: SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky Key: BEAM-14152 URL: https://issues.apache.org/jira/browse/BEAM-14152 Project: Beam Issue Type: Bug Components: io-java-gcp, test-failures Reporter: Luke Cwik Assignee: Pablo Estrada Example failures: * https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21726/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ * https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21723/testReport/junit/org.apache.beam.sdk.io.gcp.spanner.changestreams/SpannerChangeStreamErrorTest/testUnavailableExceptionRetries/ {noformat} java.lang.AssertionError: Expected: a value greater than <1> but: <0> was less than <1> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.beam.sdk.io.gcp.spanner.changestreams.SpannerChangeStreamErrorTest.testUnavailableExceptionRetries(SpannerChangeStreamErrorTest.java:166) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:323) at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (BEAM-14038) Add the ability to easily start up Python expansion services.
[ https://issues.apache.org/jira/browse/BEAM-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510875#comment-17510875 ] Luke Cwik edited comment on BEAM-14038 at 3/22/22, 8:19 PM: Filed https://issues.apache.org/jira/browse/BEAM-14148 for the extremely flaky ExternalPythonTransformTest.trivialPythonTransform test. Started rollback of https://github.com/apache/beam/pull/17035 in https://github.com/apache/beam/pull/17154 was (Author: lcwik): Filed https://issues.apache.org/jira/browse/BEAM-14148 for the extremely flaky ExternalPythonTransformTest.trivialPythonTransform test. Started rollback in https://github.com/apache/beam/pull/17154 > Add the ability to easily start up Python expansion services. > - > > Key: BEAM-14038 > URL: https://issues.apache.org/jira/browse/BEAM-14038 > Project: Beam > Issue Type: Improvement > Components: cross-language >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: P2 > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14038) Add the ability to easily start up Python expansion services.
[ https://issues.apache.org/jira/browse/BEAM-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510875#comment-17510875 ] Luke Cwik commented on BEAM-14038: -- Filed https://issues.apache.org/jira/browse/BEAM-14148 for the extremely flaky ExternalPythonTransformTest.trivialPythonTransform test. Started rollback in https://github.com/apache/beam/pull/17154 > Add the ability to easily start up Python expansion services. > - > > Key: BEAM-14038 > URL: https://issues.apache.org/jira/browse/BEAM-14038 > Project: Beam > Issue Type: Improvement > Components: cross-language >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: P2 > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14148) ExternalPythonTransformTest.trivialPythonTransform flaky
[ https://issues.apache.org/jira/browse/BEAM-14148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510876#comment-17510876 ] Luke Cwik commented on BEAM-14148: -- Started rollback of https://github.com/apache/beam/pull/17035 in https://github.com/apache/beam/pull/17154 > ExternalPythonTransformTest.trivialPythonTransform flaky > > > Key: BEAM-14148 > URL: https://issues.apache.org/jira/browse/BEAM-14148 > Project: Beam > Issue Type: Bug > Components: cross-language, test-failures >Reporter: Luke Cwik >Assignee: Robert Bradshaw >Priority: P2 > Time Spent: 10m > Remaining Estimate: 0h > > Example run: > https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/4806/testReport/junit/org.apache.beam.sdk.extensions.python/ExternalPythonTransformTest/trivialPythonTransform/ > {noformat} > java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timeout > waiting for Python service startup after 16616 seconds. > at > org.apache.beam.sdk.extensions.python.ExternalPythonTransform.expand(ExternalPythonTransform.java:107) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482) > at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:363) > at > org.apache.beam.sdk.extensions.python.ExternalPythonTransformTest.trivialPythonTransform(ExternalPythonTransformTest.java:41) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14148) ExternalPythonTransformTest.trivialPythonTransform flaky
Luke Cwik created BEAM-14148: Summary: ExternalPythonTransformTest.trivialPythonTransform flaky Key: BEAM-14148 URL: https://issues.apache.org/jira/browse/BEAM-14148 Project: Beam Issue Type: Bug Components: cross-language, test-failures Reporter: Luke Cwik Assignee: Robert Bradshaw Example run: https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/4806/testReport/junit/org.apache.beam.sdk.extensions.python/ExternalPythonTransformTest/trivialPythonTransform/ {noformat} java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timeout waiting for Python service startup after 16616 seconds. at org.apache.beam.sdk.extensions.python.ExternalPythonTransform.expand(ExternalPythonTransform.java:107) at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548) at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482) at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:363) at org.apache.beam.sdk.extensions.python.ExternalPythonTransformTest.trivialPythonTransform(ExternalPythonTransformTest.java:41) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-11325) KafkaIO should be able to read from new added topic/partition automatically during pipeline execution time
[ https://issues.apache.org/jira/browse/BEAM-11325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510121#comment-17510121 ] Luke Cwik commented on BEAM-11325: -- This is not working, see https://issues.apache.org/jira/browse/BEAM-13852 as a discussion around fixing this. > KafkaIO should be able to read from new added topic/partition automatically > during pipeline execution time > -- > > Key: BEAM-11325 > URL: https://issues.apache.org/jira/browse/BEAM-11325 > Project: Beam > Issue Type: New Feature > Components: io-java-kafka >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: P2 > Fix For: 2.29.0 > > Time Spent: 7.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-13852) KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions
[ https://issues.apache.org/jira/browse/BEAM-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510120#comment-17510120 ] Luke Cwik commented on BEAM-13852: -- DoFn can't use a processing time timer since the timer won't fire anymore once the watermark advances to infinity. Processing time timers only fire while the watermark isn't at infinity/MAX. It would likely be best to replace the implementation with the Watch[1] transform which will hold up the watermark appropriately and was built for this kind of use case already. Discussion in https://lists.apache.org/thread/fo6jrg15pwd6bjpl6k8g4l0z953mp7ry 1: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Watch.java > KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions > -- > > Key: BEAM-13852 > URL: https://issues.apache.org/jira/browse/BEAM-13852 > Project: Beam > Issue Type: Bug > Components: io-java-kafka >Reporter: John Casey >Assignee: John Casey >Priority: P2 > > KafkaIO.Read().withDynamicRead() is correctly pulling messages from existing > topics, but isn't picking up new topics or partitions when they are created. > Currently, this appears to be an interaction between the timer configuration, > and the duration of the operating window, that is causing the problem -- This message was sent by Atlassian Jira (v8.20.1#820001)
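The Watch-style semantics suggested above can be illustrated with a plain-Java sketch (this is not the Beam Watch transform itself): each poll diffs the currently listed topic-partitions against those already being consumed, so only newly created ones are emitted downstream.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration (not the Beam Watch transform) of the polling
// semantics: track which partitions are already being read and emit only
// the newly discovered ones on each poll.
public class PartitionWatcher {
  private final Set<String> knownPartitions = new HashSet<>();

  /** Returns the partitions in {@code listed} not seen on any earlier poll. */
  public Set<String> poll(List<String> listed) {
    Set<String> added = new HashSet<>();
    for (String partition : listed) {
      if (knownPartitions.add(partition)) {
        added.add(partition);
      }
    }
    return added;
  }
}
```

In the real transform, the poll would be driven by Watch's growth machinery, which also holds the watermark back while growth is still possible.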
[jira] [Commented] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows
[ https://issues.apache.org/jira/browse/BEAM-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508393#comment-17508393 ] Luke Cwik commented on BEAM-14064: -- Set the fix version > ElasticSearchIO#Write buffering and outputting across windows > - > > Key: BEAM-14064 > URL: https://issues.apache.org/jira/browse/BEAM-14064 > Project: Beam > Issue Type: Bug > Components: io-java-elasticsearch >Affects Versions: 2.35.0, 2.36.0, 2.37.0 >Reporter: Luke Cwik >Assignee: Evan Galpin >Priority: P1 > Fix For: 2.38.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db > Bug PR: https://github.com/apache/beam/pull/15381 > ElasticsearchIO is collecting results from elements in window X and then > trying to output them in window Y when flushing the batch. This exposed a bug > where elements that were being buffered were being output as part of a > different window than what the window that produced them was. > This became visible because validation was added recently to ensure that when > the pipeline is processing elements in window X that output with a timestamp > is valid for window X. Note that this validation only occurs in > *@ProcessElement* since output is associated with the current window with the > input element that is being processed. > It is ok to do this in *@FinishBundle* since there is no existing windowing > context and when you output that element is assigned to an appropriate window. 
> *Further Context* > We’ve bisected it to being introduced in 2.35.0, and I’m reasonably certain > it’s this PR https://github.com/apache/beam/pull/15381 > Our scenario is pretty trivial, we read off Pubsub and write to Elastic in a > streaming job, the config for the source and sink is respectively > {noformat} > pipeline.apply( > PubsubIO.readStrings().fromSubscription(subscription) > ).apply(ParseJsons.of(OurObject::class.java)) > .setCoder(KryoCoder.of()) > {noformat} > and > {noformat} > ElasticsearchIO.write() > .withUseStatefulBatches(true) > .withMaxParallelRequestsPerWindow(1) > .withMaxBufferingDuration(Duration.standardSeconds(30)) > // 5 bytes **> KiB **> MiB, so 5 MiB > .withMaxBatchSizeBytes(5L * 1024 * 1024) > // # of docs > .withMaxBatchSize(1000) > .withConnectionConfiguration( > ElasticsearchIO.ConnectionConfiguration.create( > arrayOf(host), > "fubar", > "_doc" > ).withConnectTimeout(5000) > .withSocketTimeout(3) > ) > .withRetryConfiguration( > ElasticsearchIO.RetryConfiguration.create( > 10, > // the duration is wall clock, against the connection and > socket timeouts specified > // above. I.e., 10 x 30s is gonna be more than 3 minutes, > so if we're getting > // 10 socket timeouts in a row, this would ignore the > "10" part and terminate > // after 6. The idea is that in a mixed failure mode, > you'd get different timeouts > // of different durations, and on average 10 x fails < 4m. > // That said, 4m is arbitrary, so adjust as and when > needed. 
> Duration.standardMinutes(4) > ) > ) > .withIdFn { f: JsonNode -> f["id"].asText() } > .withIndexFn { f: JsonNode -> f["schema_name"].asText() } > .withIsDeleteFn { f: JsonNode -> f["_action"].asText("noop") == > "delete" } > {noformat} > We recently tried upgrading 2.33 to 2.36 and immediately hit a bug in the > consumer, due to alleged time skew, specifically > {noformat} > 2022-03-07 10:48:37.886 GMTError message from worker: > java.lang.IllegalArgumentException: Cannot output with timestamp > 2022-03-07T10:43:38.640Z. Output timestamps must be no earlier than the > timestamp of the > current input (2022-03-07T10:43:43.562Z) minus the allowed skew (0 > milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the > DoFn#getAllowedTimestampSkew() Javadoc > for details on changing the allowed skew. > org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:446) > > org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:422) > > org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$B
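The fix direction implied by the description above, flushing buffered elements back into the window each element came from rather than the window current at flush time, can be simulated without Beam (a generic sketch, not ElasticsearchIO's actual internals):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Non-Beam simulation of the buffering fix sketched above: elements are
// buffered together with the window that produced them, and a flush returns
// per-window batches so each batch is emitted back into its own window
// instead of whichever window happens to be current when the flush fires.
public class WindowedBuffer<T> {
  private final Map<String, List<T>> byWindow = new LinkedHashMap<>();

  public void add(String window, T element) {
    byWindow.computeIfAbsent(window, w -> new ArrayList<>()).add(element);
  }

  /** Drains the buffer, keyed by originating window. */
  public Map<String, List<T>> flush() {
    Map<String, List<T>> batches = new LinkedHashMap<>(byWindow);
    byWindow.clear();
    return batches;
  }
}
```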
[jira] [Updated] (BEAM-14027) Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful
[ https://issues.apache.org/jira/browse/BEAM-14027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14027: - Fix Version/s: Not applicable Resolution: Fixed Status: Resolved (was: Open) > Java PreCommit flaky > org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful > --- > > Key: BEAM-14027 > URL: https://issues.apache.org/jira/browse/BEAM-14027 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Luke Cwik >Assignee: Reuven Lax >Priority: P2 > Fix For: Not applicable > > Time Spent: 20m > Remaining Estimate: 0h > > Looks like https://github.com/apache/beam/pull/16727 is using the static map > concurrently which is not thread safe. > Causing failures like: > {noformat} > java.util.ConcurrentModificationException > at java.util.HashMap$HashIterator.nextNode(HashMap.java:1469) > at java.util.HashMap$ValueIterator.next(HashMap.java:1498) > at java.util.AbstractCollection.toString(AbstractCollection.java:461) > at java.lang.String.valueOf(String.java:2994) > at java.lang.StringBuilder.append(StringBuilder.java:131) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.validate(ParDoLifecycleTest.java:392) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.access$300(ParDoLifecycleTest.java:356) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful(ParDoLifecycleTest.java:269) > {noformat} > See failure in: > https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21291/testReport/junit/org.apache.beam.sdk.transforms/ParDoLifecycleTest/testTeardownCalledAfterExceptionInProcessElementStateful/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
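The failure mode above, structural modification of a shared HashMap while it is being iterated, is easy to reproduce deterministically even on a single thread, and a concurrent map avoids it. This is a generic demo, not the actual ParDoLifecycleTest code:

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

// Generic demo (not the actual ParDoLifecycleTest code) of the failure mode:
// HashMap iterators are fail-fast, so inserting a new key during iteration
// throws ConcurrentModificationException on the next iterator step, while
// ConcurrentHashMap's weakly consistent iterators tolerate the mutation.
public class CmeDemo {
  public static boolean throwsOnMutationDuringIteration(Map<Integer, String> map) {
    map.put(1, "a");
    map.put(2, "b");
    try {
      for (Integer key : map.keySet()) {
        map.put(999, "inserted mid-iteration"); // structural modification
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }
}
```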
[jira] [Updated] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows
[ https://issues.apache.org/jira/browse/BEAM-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-14064: - Description: Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db Bug PR: https://github.com/apache/beam/pull/15381 ElasticsearchIO is collecting results from elements in window X and then trying to output them in window Y when flushing the batch. This exposed a bug where elements that were being buffered were being output as part of a different window than what the window that produced them was. This became visible because validation was added recently to ensure that when the pipeline is processing elements in window X that output with a timestamp is valid for window X. Note that this validation only occurs in *@ProcessElement* since output is associated with the current window with the input element that is being processed. It is ok to do this in *@FinishBundle* since there is no existing windowing context and when you output that element is assigned to an appropriate window. 
*Further Context* We’ve bisected it to being introduced in 2.35.0, and I’m reasonably certain it’s this PR https://github.com/apache/beam/pull/15381 Our scenario is pretty trivial, we read off Pubsub and write to Elastic in a streaming job, the config for the source and sink is respectively {noformat} pipeline.apply( PubsubIO.readStrings().fromSubscription(subscription) ).apply(ParseJsons.of(OurObject::class.java)) .setCoder(KryoCoder.of()) {noformat} and {noformat} ElasticsearchIO.write() .withUseStatefulBatches(true) .withMaxParallelRequestsPerWindow(1) .withMaxBufferingDuration(Duration.standardSeconds(30)) // 5 bytes **> KiB **> MiB, so 5 MiB .withMaxBatchSizeBytes(5L * 1024 * 1024) // # of docs .withMaxBatchSize(1000) .withConnectionConfiguration( ElasticsearchIO.ConnectionConfiguration.create( arrayOf(host), "fubar", "_doc" ).withConnectTimeout(5000) .withSocketTimeout(3) ) .withRetryConfiguration( ElasticsearchIO.RetryConfiguration.create( 10, // the duration is wall clock, against the connection and socket timeouts specified // above. I.e., 10 x 30s is gonna be more than 3 minutes, so if we're getting // 10 socket timeouts in a row, this would ignore the "10" part and terminate // after 6. The idea is that in a mixed failure mode, you'd get different timeouts // of different durations, and on average 10 x fails < 4m. // That said, 4m is arbitrary, so adjust as and when needed. Duration.standardMinutes(4) ) ) .withIdFn { f: JsonNode -> f["id"].asText() } .withIndexFn { f: JsonNode -> f["schema_name"].asText() } .withIsDeleteFn { f: JsonNode -> f["_action"].asText("noop") == "delete" } {noformat} We recently tried upgrading 2.33 to 2.36 and immediately hit a bug in the consumer, due to alleged time skew, specifically {noformat} 2022-03-07 10:48:37.886 GMTError message from worker: java.lang.IllegalArgumentException: Cannot output with timestamp 2022-03-07T10:43:38.640Z. 
Output timestamps must be no earlier than the timestamp of the current input (2022-03-07T10:43:43.562Z) minus the allowed skew (0 milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the DoFn#getAllowedTimestampSkew() Javadoc for details on changing the allowed skew. org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:446) org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:422) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn$ProcessContextAdapter.output(ElasticsearchIO.java:2364) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn.flushAndOutputResults(ElasticsearchIO.java:2404) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn.addAndMaybeFlush(ElasticsearchIO.java:2419) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOStatefulFn.processElement(ElasticsearchIO.java:2300) {noformat} I’ve bisected it and 2.34 works fine, 2.35 is the first version this breaks, and it seems like the code in the trace is largely added by the PR linked above. The error usually claims a skew of a few seconds, but obviously I can’t override getAllowedTimestampSkew() on the internal Elastic DoFn, and it’s marked deprecated anyway. I
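The window mix-up described in this issue comes from batching elements in one window and flushing them in another. The direction of the fix is to keep each buffered element associated with the window that produced it and emit per window at flush time; this can be sketched in plain Java (a hypothetical PerWindowBuffer type with strings standing in for Beam's windows and elements, not the actual ElasticsearchIO code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: batches are keyed by the window of the element that
// produced them, so a flush emits each batch back into its own window
// instead of whatever window happens to be current at flush time.
class PerWindowBuffer {
  private final Map<String, List<String>> batches = new LinkedHashMap<>();

  // Called per element (analogous to @ProcessElement): remember the window.
  void add(String window, String element) {
    batches.computeIfAbsent(window, w -> new ArrayList<>()).add(element);
  }

  // Called at flush time (analogous to @FinishBundle): emit per window.
  Map<String, List<String>> flush() {
    Map<String, List<String>> out = new LinkedHashMap<>(batches);
    batches.clear();
    return out;
  }
}
```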
[jira] [Created] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows
Luke Cwik created BEAM-14064: Summary: ElasticSearchIO#Write buffering and outputting across windows Key: BEAM-14064 URL: https://issues.apache.org/jira/browse/BEAM-14064 Project: Beam Issue Type: Bug Components: io-java-elasticsearch Affects Versions: 2.36.0, 2.35.0 Reporter: Luke Cwik Assignee: Evan Galpin Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db Bug PR: https://github.com/apache/beam/pull/15381 ElasticsearchIO is collecting results from elements in window X and then trying to output them in window Y when flushing the batch. This exposed a bug where buffered elements were being output as part of a different window than the one that produced them. This became visible because validation was added recently to ensure that when the pipeline is processing elements in window X, any output with a timestamp is valid for window X. Note that this validation only occurs in @ProcessElement since output is associated with the window of the input element that is being processed. 
Further Context We’ve bisected it to being introduced in 2.35.0, and I’m reasonably certain it’s this PR https://github.com/apache/beam/pull/15381 Our scenario is pretty trivial, we read off Pubsub and write to Elastic in a streaming job, the config for the source and sink is respectively {noformat} pipeline.apply( PubsubIO.readStrings().fromSubscription(subscription) ).apply(ParseJsons.of(OurObject::class.java)) .setCoder(KryoCoder.of()) {noformat} and {noformat} ElasticsearchIO.write() .withUseStatefulBatches(true) .withMaxParallelRequestsPerWindow(1) .withMaxBufferingDuration(Duration.standardSeconds(30)) // 5 bytes -> KiB -> MiB, so 5 MiB .withMaxBatchSizeBytes(5L * 1024 * 1024) // # of docs .withMaxBatchSize(1000) .withConnectionConfiguration( ElasticsearchIO.ConnectionConfiguration.create( arrayOf(host), "fubar", "_doc" ).withConnectTimeout(5000) .withSocketTimeout(3) ) .withRetryConfiguration( ElasticsearchIO.RetryConfiguration.create( 10, // the duration is wall clock, against the connection and socket timeouts specified // above. I.e., 10 x 30s is gonna be more than 3 minutes, so if we're getting // 10 socket timeouts in a row, this would ignore the "10" part and terminate // after 6. The idea is that in a mixed failure mode, you'd get different timeouts // of different durations, and on average 10 x fails < 4m. // That said, 4m is arbitrary, so adjust as and when needed. Duration.standardMinutes(4) ) ) .withIdFn { f: JsonNode -> f["id"].asText() } .withIndexFn { f: JsonNode -> f["schema_name"].asText() } .withIsDeleteFn { f: JsonNode -> f["_action"].asText("noop") == "delete" } {noformat} We recently tried upgrading 2.33 to 2.36 and immediately hit a bug in the consumer, due to alleged time skew, specifically {noformat} 2022-03-07 10:48:37.886 GMT Error message from worker: java.lang.IllegalArgumentException: Cannot output with timestamp 2022-03-07T10:43:38.640Z. 
Output timestamps must be no earlier than the timestamp of the current input (2022-03-07T10:43:43.562Z) minus the allowed skew (0 milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the DoFn#getAllowedTimestampSkew() Javadoc for details on changing the allowed skew. org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:446) org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:422) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn$ProcessContextAdapter.output(ElasticsearchIO.java:2364) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn.flushAndOutputResults(ElasticsearchIO.java:2404) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOBaseFn.addAndMaybeFlush(ElasticsearchIO.java:2419) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIO$BulkIO$BulkIOStatefulFn.processElement(ElasticsearchIO.java:2300) {noformat} I’ve bisected it and 2.34 works fine, 2.35 is the first version this breaks, and it seems like the code in the trace is largely added by the PR linked above. The error usually claims a skew of a few seconds, but obviously I can’t override getAllowedTimestampSkew() on the internal Elastic DoFn, and it’s marked deprecated anyway.
[jira] [Created] (BEAM-14028) Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful
Luke Cwik created BEAM-14028: Summary: Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful Key: BEAM-14028 URL: https://issues.apache.org/jira/browse/BEAM-14028 Project: Beam Issue Type: Bug Components: test-failures Reporter: Luke Cwik Assignee: Reuven Lax It looks like the static callStatesMap is modified concurrently by multiple instances of the DoFn. Causes failures like (https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21291/testReport/junit/org.apache.beam.sdk.transforms/ParDoLifecycleTest/testTeardownCalledAfterExceptionInProcessElementStateful/): {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1469) at java.util.HashMap$ValueIterator.next(HashMap.java:1498) at java.util.AbstractCollection.toString(AbstractCollection.java:461) at java.lang.String.valueOf(String.java:2994) at java.lang.StringBuilder.append(StringBuilder.java:131) at org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.validate(ParDoLifecycleTest.java:392) at org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.access$300(ParDoLifecycleTest.java:356) at org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful(ParDoLifecycleTest.java:269) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Deleted] (BEAM-14028) Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful
[ https://issues.apache.org/jira/browse/BEAM-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik deleted BEAM-14028: - > Java PreCommit flaky > org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful > --- > > Key: BEAM-14028 > URL: https://issues.apache.org/jira/browse/BEAM-14028 > Project: Beam > Issue Type: Bug >Reporter: Luke Cwik >Assignee: Reuven Lax >Priority: P2 > > It looks like the static callStatesMap is modified concurrently by multiple > instances of the DoFn. > Causes failures like > (https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21291/testReport/junit/org.apache.beam.sdk.transforms/ParDoLifecycleTest/testTeardownCalledAfterExceptionInProcessElementStateful/): > {noformat} > java.util.ConcurrentModificationException > at java.util.HashMap$HashIterator.nextNode(HashMap.java:1469) > at java.util.HashMap$ValueIterator.next(HashMap.java:1498) > at java.util.AbstractCollection.toString(AbstractCollection.java:461) > at java.lang.String.valueOf(String.java:2994) > at java.lang.StringBuilder.append(StringBuilder.java:131) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.validate(ParDoLifecycleTest.java:392) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.access$300(ParDoLifecycleTest.java:356) > at > org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful(ParDoLifecycleTest.java:269) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-14027) Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful
Luke Cwik created BEAM-14027: Summary: Java PreCommit flaky org.apache.beam.sdk.transforms/ParDoLifecycleTest#testTeardownCalledAfterExceptionInProcessElementStateful Key: BEAM-14027 URL: https://issues.apache.org/jira/browse/BEAM-14027 Project: Beam Issue Type: Bug Components: test-failures Reporter: Luke Cwik Assignee: Reuven Lax Looks like https://github.com/apache/beam/pull/16727 is using the static map concurrently which is not thread safe. Causing failures like: {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1469) at java.util.HashMap$ValueIterator.next(HashMap.java:1498) at java.util.AbstractCollection.toString(AbstractCollection.java:461) at java.lang.String.valueOf(String.java:2994) at java.lang.StringBuilder.append(StringBuilder.java:131) at org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.validate(ParDoLifecycleTest.java:392) at org.apache.beam.sdk.transforms.ParDoLifecycleTest$ExceptionThrowingFn.access$300(ParDoLifecycleTest.java:356) at org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful(ParDoLifecycleTest.java:269) {noformat} See failure in: https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/21291/testReport/junit/org.apache.beam.sdk.transforms/ParDoLifecycleTest/testTeardownCalledAfterExceptionInProcessElementStateful/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
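The stack trace in the two reports above is the textbook symptom of structurally modifying a HashMap while something is iterating it (here, AbstractCollection.toString walking the values during validate). A small self-contained illustration of the failure, with ConcurrentHashMap as the usual fix for a map shared across DoFn instances (iterationFails is a hypothetical helper, not test code from Beam):

```java
import java.util.Map;

class SharedMapDemo {
  // Returns true if mutating the map mid-iteration throws
  // ConcurrentModificationException, as a plain HashMap does.
  static boolean iterationFails(Map<String, String> map) {
    map.put("k1", "v1");
    map.put("k2", "v2");
    try {
      for (String key : map.keySet()) {
        map.put("extra", key); // structural modification on the first pass
      }
      return false;
    } catch (java.util.ConcurrentModificationException e) {
      return true;
    }
  }
}
```

ConcurrentHashMap's iterators are weakly consistent, so the same access pattern completes without throwing, which is why it is the standard replacement for a static map touched by concurrently executing DoFn instances.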
[jira] [Updated] (BEAM-12756) Negative value being returned for progress when using OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-12756: - Fix Version/s: Not applicable Resolution: Fixed Status: Resolved (was: Open) > Negative value being returned for progress when using OffsetRangeTracker > > > Key: BEAM-12756 > URL: https://issues.apache.org/jira/browse/BEAM-12756 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow, sdk-java-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P3 > Fix For: Not applicable > > Attachments: job_logs.png > > > range.getFrom() returns 1 and range.getTo() returns 9223372036854775807 > (Long.MAX_VALUE). In the returned Progress object, workCompleted is 0.0 and > workRemaining is 9.223372036854776E18 > Somehow, in the Dataflow backend, it becomes -9223372036854775808 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-12756) Negative value being returned for progress when using OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499626#comment-17499626 ] Luke Cwik commented on BEAM-12756: -- This was an internal Dataflow bug: a conversion of a large double value needed to be a saturated cast rather than a static cast, since in C++ a cast from a double value beyond the bounds of an int64 can return the smallest negative integer. > Negative value being returned for progress when using OffsetRangeTracker > > > Key: BEAM-12756 > URL: https://issues.apache.org/jira/browse/BEAM-12756 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow, sdk-java-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P3 > Attachments: job_logs.png > > > range.getFrom() returns 1 and range.getTo() returns 9223372036854775807 > (Long.MAX_VALUE). In the returned Progress object, workCompleted is 0.0 and > workRemaining is 9.223372036854776E18 > Somehow, in the Dataflow backend, it becomes -9223372036854775808 -- This message was sent by Atlassian Jira (v8.20.1#820001)
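To make the comment above concrete: a saturated cast clamps out-of-range values to the int64 bounds instead of letting the conversion produce an implementation-defined result, which is how 9.223372036854776E18 became -9223372036854775808. A sketch of the clamping logic in Java (note that Java's own (long) cast already saturates; the explicit bounds checks are spelled out here purely for illustration of what the C++ side needed):

```java
class SaturatedCast {
  // Clamp a double into the int64 range instead of letting the conversion
  // wrap or yield an implementation-defined value.
  static long saturatedCast(double d) {
    if (Double.isNaN(d)) {
      return 0L;
    }
    if (d >= (double) Long.MAX_VALUE) {
      return Long.MAX_VALUE;
    }
    if (d <= (double) Long.MIN_VALUE) {
      return Long.MIN_VALUE;
    }
    return (long) d;
  }
}
```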
[jira] [Assigned] (BEAM-12756) Negative value being returned for progress when using OffsetRangeTracker
[ https://issues.apache.org/jira/browse/BEAM-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-12756: Assignee: Luke Cwik > Negative value being returned for progress when using OffsetRangeTracker > > > Key: BEAM-12756 > URL: https://issues.apache.org/jira/browse/BEAM-12756 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow, sdk-java-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P3 > Attachments: job_logs.png > > > range.getFrom() returns 1 and range.getTo() returns 9223372036854775807 > (Long.MAX_VALUE). In the returned Progress object, workCompleted is 0.0 and > workRemaining is 9.223372036854776E18 > Somehow, in the Dataflow backend, it becomes -9223372036854775808 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-14008) Java PreCommit failing at HEAD in sdks:java:fn-execution:analyzeClassesDepdencies
[ https://issues.apache.org/jira/browse/BEAM-14008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499144#comment-17499144 ] Luke Cwik commented on BEAM-14008: -- https://ci-beam.apache.org/job/beam_PreCommit_Java_Phrase/4672 got past the failures. > Java PreCommit failing at HEAD in > sdks:java:fn-execution:analyzeClassesDepdencies > - > > Key: BEAM-14008 > URL: https://issues.apache.org/jira/browse/BEAM-14008 > Project: Beam > Issue Type: Bug > Components: sdk-java-core, test-failures >Reporter: Brian Hulette >Assignee: Luke Cwik >Priority: P1 > Fix For: Not applicable > > Time Spent: 50m > Remaining Estimate: 0h > > First failure: https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/5144/ > {code} > 13:45:10 1: Task failed with an exception. > 13:45:10 --- > 13:45:10 * What went wrong: > 13:45:10 Execution failed for task > ':sdks:java:fn-execution:analyzeClassesDependencies'. > 13:45:10 > Dependency analysis found issues. > 13:45:10 usedUndeclaredArtifacts > 13:45:10- org.apache.avro:avro:1.8.2@jar > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-13965) Polymorphic types not supported in PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13965: - Fix Version/s: 2.38.0 Resolution: Fixed Status: Resolved (was: Open) > Polymorphic types not supported in PipelineOptions > -- > > Key: BEAM-13965 > URL: https://issues.apache.org/jira/browse/BEAM-13965 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Moritz Mack >Assignee: Moritz Mack >Priority: P2 > Fix For: 2.38.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I noticed that polymorphic types using @JsonTypeInfo and @JsonSubTypes are > currently not supported in pipeline options as deserialization fails lacking > necessary type information. One has to provide a de/serializer and handle > things manually. > Looks like the deserialization code path should just follow the serialization > path. > > {noformat} > diff --git > a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java > > b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java > --- > a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java > (revision 53f5a3c1756509b6fab75d3946a9f95247d02184) > +++ > b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java > (date 1645114275316) > @@ -1725,7 +1725,8 @@ > } >} > > - private static JsonDeserializer > computeDeserializerForMethod(Method method) { > + private static Optional> > computeCustomDeserializerForMethod( > + Method method) { > try { >BeanProperty prop = createBeanProperty(method); >AnnotatedMember annotatedMethod = prop.getMember(); > @@ -1736,16 +1737,10 @@ >.getAnnotationIntrospector() >.findDeserializer(annotatedMethod); > > - JsonDeserializer jsonDeserializer = > + return Optional.fromNullable( >DESERIALIZATION_CONTEXT >.get() > - .deserializerInstance(annotatedMethod, maybeDeserializerClass); > - > - if (jsonDeserializer == null) { > 
-jsonDeserializer = > - > DESERIALIZATION_CONTEXT.get().findContextualValueDeserializer(prop.getType(), > prop); > - } > - return jsonDeserializer; > + .deserializerInstance(annotatedMethod, > maybeDeserializerClass)); > } catch (JsonMappingException e) { >throw new RuntimeException(e); > } > @@ -1771,11 +1766,12 @@ > * JsonDeserialize} the specified deserializer from the annotation is > returned, otherwise the > * default is returned. > */ > - private static JsonDeserializer getDeserializerForMethod(Method > method) { > + private static @Nullable JsonDeserializer > getCustomDeserializerForMethod(Method method) { > return CACHE > .get() > .deserializerCache > -.computeIfAbsent(method, > PipelineOptionsFactory::computeDeserializerForMethod); > +.computeIfAbsent(method, > PipelineOptionsFactory::computeCustomDeserializerForMethod) > +.orNull(); >} > >/** > @@ -1796,10 +1792,13 @@ >return null; > } > > +JsonDeserializer jsonDeserializer = > getCustomDeserializerForMethod(method); > +if (jsonDeserializer == null) { > + return DESERIALIZATION_CONTEXT.get().readTreeAsValue(node, > method.getReturnType()); > +} > + > JsonParser parser = new TreeTraversingParser(node, MAPPER); > parser.nextToken(); > - > -JsonDeserializer jsonDeserializer = > getDeserializerForMethod(method); > return jsonDeserializer.deserialize(parser, > DESERIALIZATION_CONTEXT.get()); >} > > @@ -2055,7 +2054,8 @@ > private final Map>, > Registration> combinedCache = > Maps.newConcurrentMap(); > > -private final Map> deserializerCache = > Maps.newConcurrentMap(); > +private final Map>> > deserializerCache = > +Maps.newConcurrentMap(); > > private final Map>> > serializerCache = > Maps.newConcurrentMap(); > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
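The polymorphic shape the report describes can be illustrated with a small fragment (this assumes jackson-databind on the classpath; Shape, Circle, and Square are hypothetical types, not from Beam). With @JsonTypeInfo and @JsonSubTypes the concrete subtype is recorded in the serialized JSON, which is the kind of type the patch above makes deserializable from PipelineOptions without a hand-written deserializer:

```java
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

// Hypothetical polymorphic option value: the "type" property in the JSON
// selects the concrete class during deserialization.
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
@JsonSubTypes({
  @JsonSubTypes.Type(value = Circle.class, name = "circle"),
  @JsonSubTypes.Type(value = Square.class, name = "square")
})
interface Shape {}

class Circle implements Shape { public double radius; }

class Square implements Shape { public double side; }
```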
[jira] [Updated] (BEAM-13930) Address StateSpec inconsistency between Runner and Fn API
[ https://issues.apache.org/jira/browse/BEAM-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13930: - Resolution: Fixed Status: Resolved (was: Open) > Address StateSpec inconsistency between Runner and Fn API > - > > Key: BEAM-13930 > URL: https://issues.apache.org/jira/browse/BEAM-13930 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.37.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > The ability to mix and match runners and SDKs is accomplished through two > portability layers: > 1. The Runner API provides an SDK-and-runner-independent definition of a Beam > pipeline > 2. The Fn API allows a runner to invoke SDK-specific user-defined functions > Apache Beam pipelines support executing stateful DoFns[1]. To support this > execution the Runner API defines multiple user state specifications: > * ReadModifyWriteStateSpec > * BagStateSpec > * OrderedListStateSpec > * CombiningStateSpec > * MapStateSpec > * SetStateSpec > The Fn API[2] defines APIs[3] to get, append and clear user state currently > supporting a BagUserState and MultimapUserState protocol. > Since there is no clear mapping between the Runner API and Fn API state > specifications, there is no way for a runner to know that it supports a given > API necessary to support the execution of the pipeline. The Runner will also > have to manage additional runtime metadata associated with which protocol was > used for a type of state so that it can successfully manage the state’s > lifetime once it can be garbage collected. > Please see the doc[4] for further details and a proposal on how to address > this shortcoming. 
> 1: https://beam.apache.org/blog/stateful-processing/ > 2: > https://github.com/apache/beam/blob/3ad05523f4cdf5122fc319276fcb461f768af39d/model/fn-execution/src/main/proto/beam_fn_api.proto#L742 > 3: https://s.apache.org/beam-fn-state-api-and-bundle-processing > 4: http://doc/1ELKTuRTV3C5jt_YoBBwPdsPa5eoXCCOSKQ3GPzZrK7Q -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (BEAM-13930) Address StateSpec inconsistency between Runner and Fn API
[ https://issues.apache.org/jira/browse/BEAM-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13930: - Fix Version/s: 2.37.0 > Address StateSpec inconsistency between Runner and Fn API > - > > Key: BEAM-13930 > URL: https://issues.apache.org/jira/browse/BEAM-13930 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Fix For: 2.37.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > The ability to mix and match runners and SDKs is accomplished through two > portability layers: > 1. The Runner API provides an SDK-and-runner-independent definition of a Beam > pipeline > 2. The Fn API allows a runner to invoke SDK-specific user-defined functions > Apache Beam pipelines support executing stateful DoFns[1]. To support this > execution the Runner API defines multiple user state specifications: > * ReadModifyWriteStateSpec > * BagStateSpec > * OrderedListStateSpec > * CombiningStateSpec > * MapStateSpec > * SetStateSpec > The Fn API[2] defines APIs[3] to get, append and clear user state currently > supporting a BagUserState and MultimapUserState protocol. > Since there is no clear mapping between the Runner API and Fn API state > specifications, there is no way for a runner to know that it supports a given > API necessary to support the execution of the pipeline. The Runner will also > have to manage additional runtime metadata associated with which protocol was > used for a type of state so that it can successfully manage the state’s > lifetime once it can be garbage collected. > Please see the doc[4] for further details and a proposal on how to address > this shortcoming. 
> 1: https://beam.apache.org/blog/stateful-processing/ > 2: > https://github.com/apache/beam/blob/3ad05523f4cdf5122fc319276fcb461f768af39d/model/fn-execution/src/main/proto/beam_fn_api.proto#L742 > 3: https://s.apache.org/beam-fn-state-api-and-bundle-processing > 4: http://doc/1ELKTuRTV3C5jt_YoBBwPdsPa5eoXCCOSKQ3GPzZrK7Q -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-13930) Address StateSpec inconsistency between Runner and Fn API
Luke Cwik created BEAM-13930: Summary: Address StateSpec inconsistency between Runner and Fn API Key: BEAM-13930 URL: https://issues.apache.org/jira/browse/BEAM-13930 Project: Beam Issue Type: Improvement Components: beam-model, sdk-java-core, sdk-py-core Reporter: Luke Cwik Assignee: Luke Cwik The ability to mix and match runners and SDKs is accomplished through two portability layers: 1. The Runner API provides an SDK-and-runner-independent definition of a Beam pipeline 2. The Fn API allows a runner to invoke SDK-specific user-defined functions Apache Beam pipelines support executing stateful DoFns[1]. To support this execution the Runner API defines multiple user state specifications: * ReadModifyWriteStateSpec * BagStateSpec * OrderedListStateSpec * CombiningStateSpec * MapStateSpec * SetStateSpec The Fn API[2] defines APIs[3] to get, append and clear user state currently supporting a BagUserState and MultimapUserState protocol. Since there is no clear mapping between the Runner API and Fn API state specifications, there is no way for a runner to know that it supports a given API necessary to support the execution of the pipeline. The Runner will also have to manage additional runtime metadata associated with which protocol was used for a type of state so that it can successfully manage the state’s lifetime once it can be garbage collected. Please see the doc[4] for further details and a proposal on how to address this shortcoming. 1: https://beam.apache.org/blog/stateful-processing/ 2: https://github.com/apache/beam/blob/3ad05523f4cdf5122fc319276fcb461f768af39d/model/fn-execution/src/main/proto/beam_fn_api.proto#L742 3: https://s.apache.org/beam-fn-state-api-and-bundle-processing 4: http://doc/1ELKTuRTV3C5jt_YoBBwPdsPa5eoXCCOSKQ3GPzZrK7Q -- This message was sent by Atlassian Jira (v8.20.1#820001)
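The get, append, and clear operations mentioned above are the whole surface of the Fn API user-state protocols. As a deliberately simplified model (a hypothetical MultimapStateModel class, not Beam's implementation), the MultimapUserState shape is essentially keyed bags of values:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a multimap-style user state protocol: each user key
// holds a bag of values supporting get, append, and clear.
class MultimapStateModel {
  private final Map<String, List<String>> state = new HashMap<>();

  List<String> get(String key) {
    return state.getOrDefault(key, List.of());
  }

  void append(String key, String value) {
    state.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
  }

  void clear(String key) {
    state.remove(key);
  }
}
```

The mapping problem the issue describes is that a runner seeing, say, a SetStateSpec in the pipeline proto cannot tell from the spec alone which of these protocols the SDK will use against it.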
[jira] [Updated] (BEAM-13686) OOM while logging a large pipeline even when logging level is higher
[ https://issues.apache.org/jira/browse/BEAM-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13686: - Fix Version/s: 2.37.0 Resolution: Fixed Status: Resolved (was: Open) > OOM while logging a large pipeline even when logging level is higher > > > Key: BEAM-13686 > URL: https://issues.apache.org/jira/browse/BEAM-13686 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Affects Versions: 2.35.0 >Reporter: Vitaly Ivanov >Assignee: Vitaly Ivanov >Priority: P1 > Fix For: 2.37.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Logging message is calculated even when logging level is higher, so there is > no WA to increase the logging level. > {noformat} > Exception in thread "main" java.lang.OutOfMemoryError: Java heap space > at java.base/java.util.Arrays.copyOf(Arrays.java:3745) > at > java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172) > at > java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538) > at java.base/java.lang.StringBuilder.append(StringBuilder.java:174) > at java.base/java.lang.StringBuilder.append(StringBuilder.java:85) > at > java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:573) > at java.base/java.lang.StringBuilder.append(StringBuilder.java:204) > at java.base/java.lang.StringBuilder.append(StringBuilder.java:85) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$TextGenerator.print(TextFormat.java:860) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:574) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:743) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:443) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printMessage(TextFormat.java:705) > at > 
org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:353) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:597) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:743) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:443) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printMessage(TextFormat.java:705) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:353) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:597) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:743) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:435) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printMessage(TextFormat.java:705) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:353) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:597) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:743) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:443) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printMessage(TextFormat.java:705) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:353) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.print(TextFormat.java:339) > at > 
org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat$Printer.printToString(TextFormat.java:606) > at > org.apache.beam.vendor.grpc.v1p36p0.com.google.protobuf.TextFormat.printToString(TextFormat.java:149){noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
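The fix pattern for the bug above is to defer building the expensive message (the TextFormat rendering of the pipeline proto) until the logging level is known to be enabled. java.util.logging supports this directly via its Supplier overload; a sketch with stdlib logging (Beam's actual logging setup differs, and RENDER_COUNT is a hypothetical counter used only to show the supplier being skipped):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyLoggingDemo {
  static final AtomicInteger RENDER_COUNT = new AtomicInteger();

  // Stand-in for an expensive rendering such as TextFormat.printToString(proto).
  static String renderPipeline() {
    RENDER_COUNT.incrementAndGet();
    return "very large pipeline proto";
  }

  // The Supplier overload defers message construction: renderPipeline() only
  // runs if FINE is actually loggable for this logger.
  static void logLazily(Logger logger) {
    logger.log(Level.FINE, LazyLoggingDemo::renderPipeline);
  }
}
```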
[jira] [Commented] (BEAM-13814) Run errorprone with Java 11 (required in 2.11.0 and newer)
[ https://issues.apache.org/jira/browse/BEAM-13814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486779#comment-17486779 ] Luke Cwik commented on BEAM-13814: -- We could have a Java 11 specific PreCommit. Also, unless someone is using Java 11+, they won't be able to reproduce locally and only via PR so having a dedicated PreCommit might make sense for speed reasons. > Run errorprone with Java 11 (required in 2.11.0 and newer) > -- > > Key: BEAM-13814 > URL: https://issues.apache.org/jira/browse/BEAM-13814 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Brian Hulette >Priority: P2 > > errorprone 2.11.0 only supports JDK 11 (but can still validate Java 8 > classes). We should find a way to upgrade errorprone while still supporting > Java 8. > Perhaps we could modify BeamModulePlugin to disable errorprone when building > with Java 8, and make sure that there's a Java 11 PreCommit that will run it? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (BEAM-13801) Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1
[ https://issues.apache.org/jira/browse/BEAM-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-13801: Assignee: (was: Luke Cwik) > Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1 > -- > > Key: BEAM-13801 > URL: https://issues.apache.org/jira/browse/BEAM-13801 > Project: Beam > Issue Type: Test > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Priority: P2 > Time Spent: 3h 20m > Remaining Estimate: 0h > > We lack coverage for beam:coder:state_backed_iterable:v1 in > standard_coders.yaml -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (BEAM-13801) Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1
[ https://issues.apache.org/jira/browse/BEAM-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486004#comment-17486004 ] Luke Cwik commented on BEAM-13801: -- Java and Python are now covered. Follow-up with coverage in Go. > Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1 > -- > > Key: BEAM-13801 > URL: https://issues.apache.org/jira/browse/BEAM-13801 > Project: Beam > Issue Type: Test > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > Time Spent: 3h 20m > Remaining Estimate: 0h > > We lack coverage for beam:coder:state_backed_iterable:v1 in > standard_coders.yaml -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (BEAM-13801) Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1
Luke Cwik created BEAM-13801: Summary: Add standard_coders.yaml cases for beam:coder:state_backed_iterable:v1 Key: BEAM-13801 URL: https://issues.apache.org/jira/browse/BEAM-13801 Project: Beam Issue Type: Test Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness Reporter: Luke Cwik Assignee: Luke Cwik We lack coverage for beam:coder:state_backed_iterable:v1 in standard_coders.yaml
[jira] [Updated] (BEAM-13777) confluent schema registry cache capacity
[ https://issues.apache.org/jira/browse/BEAM-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13777: - Fix Version/s: 2.37.0 Resolution: Fixed Status: Resolved (was: In Progress) > confluent schema registry cache capacity > > > Key: BEAM-13777 > URL: https://issues.apache.org/jira/browse/BEAM-13777 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Mostafa Aghajani >Assignee: Mostafa Aghajani >Priority: P2 > Fix For: 2.37.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The cache capacity should be specified as an input parameter instead of > defaulting to the maximum integer. Usage differs considerably case by case, and a > default of Integer.MAX_VALUE can lead to an error like this depending on the setup: > {{Exception in thread "main" java.lang.OutOfMemoryError: Java heap space}} > Some documentation link on the parameter: > [https://docs.confluent.io/5.4.2/clients/confluent-kafka-dotnet/api/Confluent.SchemaRegistry.CachedSchemaRegistryClient.html#Confluent_SchemaRegistry_CachedSchemaRegistryClient_DefaultMaxCachedSchemas]
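The OutOfMemoryError above is what an effectively unbounded cache invites. The fix exposes the capacity as a caller-supplied parameter; the effect can be illustrated with a generic size-bounded LRU map (an illustration only, not the Confluent client's internal cache implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a size-bounded LRU cache. A caller-supplied capacity
// keeps memory bounded, unlike a default capacity of Integer.MAX_VALUE.
public final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public BoundedCache(int capacity) {
        super(16, 0.75f, true); // access-order iteration gives LRU eviction
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict instead of growing without bound
    }
}
```

With the fix, callers sized for a handful of schemas can pass a small capacity instead of inheriting a limit large enough to exhaust the heap.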
[jira] [Commented] (BEAM-13353) KafkaIO should raise an error if both .withReadCommitted() and .commitOffsetsInFinalize() are used
[ https://issues.apache.org/jira/browse/BEAM-13353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478281#comment-17478281 ] Luke Cwik commented on BEAM-13353: -- Added details to the description. > KafkaIO should raise an error if both .withReadCommitted() and > .commitOffsetsInFinalize() are used > -- > > Key: BEAM-13353 > URL: https://issues.apache.org/jira/browse/BEAM-13353 > Project: Beam > Issue Type: Improvement > Components: io-java-kafka >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > > Read committed tells KafkaIO to only read messages that are already committed, > which means that committing offsets in finalize is a no-op. > Users should use one or the other; this is similar to ensuring that > either auto commit is enabled in the reader or KafkaIO does the committing, > which we already have error checking for.
[jira] [Updated] (BEAM-13353) KafkaIO should raise an error if both .withReadCommitted() and .commitOffsetsInFinalize() are used
[ https://issues.apache.org/jira/browse/BEAM-13353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-13353: - Description: Read committed tells KafkaIO to only read messages that are already committed, which means that committing offsets in finalize is a no-op. Users should use one or the other; this is similar to ensuring that either auto commit is enabled in the reader or KafkaIO does the committing, which we already have error checking for. > KafkaIO should raise an error if both .withReadCommitted() and > .commitOffsetsInFinalize() are used > -- > > Key: BEAM-13353 > URL: https://issues.apache.org/jira/browse/BEAM-13353 > Project: Beam > Issue Type: Improvement > Components: io-java-kafka >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: P2 > > Read committed tells KafkaIO to only read messages that are already committed, > which means that committing offsets in finalize is a no-op. > Users should use one or the other; this is similar to ensuring that > either auto commit is enabled in the reader or KafkaIO does the committing, > which we already have error checking for.
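The mutual-exclusion check proposed above can be sketched as a simple precondition. This is an illustrative sketch of the intended error, not KafkaIO's actual validation code; the class and method names are hypothetical:

```java
// Illustrative sketch of the proposed check: reject configurations that
// combine read_committed isolation with offset commits on finalize, since
// (per the issue description) the latter is a no-op under the former.
public final class KafkaReadValidation {

    static void validate(boolean readCommitted, boolean commitOffsetsInFinalize) {
        if (readCommitted && commitOffsetsInFinalize) {
            throw new IllegalStateException(
                "withReadCommitted() and commitOffsetsInFinalize() are mutually exclusive: "
                    + "read_committed only reads already-committed records, so committing "
                    + "offsets on finalize has no effect.");
        }
    }
}
```

Raising the error at pipeline construction time surfaces the misconfiguration immediately, mirroring the existing check around auto commit versus KafkaIO-managed commits.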