[jira] [Commented] (FLINK-20578) Cannot create empty array using ARRAY[]
[ https://issues.apache.org/jira/browse/FLINK-20578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827826#comment-17827826 ]

Nathan Taylor Armstrong Lewis commented on FLINK-20578:
--------------------------------------------------------

Does anyone know of a workaround to create an empty array literal until this issue is addressed?

> Cannot create empty array using ARRAY[]
> ----------------------------------------
>
>                 Key: FLINK-20578
>                 URL: https://issues.apache.org/jira/browse/FLINK-20578
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>    Affects Versions: 1.11.2
>            Reporter: Fabian Hueske
>            Assignee: Eric Xiao
>            Priority: Major
>              Labels: pull-request-available, stale-assigned, starter
>             Fix For: 1.20.0
>
>         Attachments: Screen Shot 2022-10-25 at 10.50.42 PM.png, Screen Shot 2022-10-25 at 10.50.47 PM.png, Screen Shot 2022-10-25 at 11.01.06 PM.png, Screen Shot 2022-10-26 at 2.28.49 PM.png, image-2022-10-26-14-42-08-468.png, image-2022-10-26-14-42-57-579.png
>
> Calling the ARRAY function without an element (`ARRAY[]`) results in an error message.
> Is that the expected behavior?
> How can users create empty arrays?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
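One possible workaround until ARRAY[] is supported is a trivial user-defined function that returns an empty array. A minimal sketch, assuming a UDF is acceptable in your setup (class and function names here are hypothetical, not from the ticket):

{code:java}
import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical UDF returning an empty array; Integer[] should be
// inferred as ARRAY<INT> by the Table API type extraction.
public class EmptyIntArray extends ScalarFunction {
    public Integer[] eval() {
        return new Integer[0];
    }
}
{code}

It could then be registered via something like {{CREATE TEMPORARY FUNCTION empty_int_array AS 'com.example.EmptyIntArray'}} and called wherever an empty array literal is needed; the same pattern would apply to other element types.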
[jira] [Commented] (FLINK-33817) Allow ReadDefaultValues = False for non primitive types on Proto3
[ https://issues.apache.org/jira/browse/FLINK-33817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819243#comment-17819243 ]

Nathan Taylor Armstrong Lewis commented on FLINK-33817:
--------------------------------------------------------

[~libenchao], yes, that should work. We are currently using a fork with this fix cherry-picked in, so we can stay on that until the latest development version goes stable. (y)

> Allow ReadDefaultValues = False for non primitive types on Proto3
> ------------------------------------------------------------------
>
>                 Key: FLINK-33817
>                 URL: https://issues.apache.org/jira/browse/FLINK-33817
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.18.0
>            Reporter: Sai Sharath Dandi
>            Priority: Major
>              Labels: pull-request-available
>
> *Background*
>
> The current Protobuf format [implementation|https://github.com/apache/flink/blob/c3e2d163a637dca5f49522721109161bd7ebb723/flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/deserialize/ProtoToRowConverter.java] always sets ReadDefaultValues=False when using Proto3. This can cause severe performance degradation for large Protobuf schemas with OneOf fields, because the entire generated code needs to be executed during deserialization even when certain fields are not present in the data, in which case all the subsequent nested fields could be skipped. Proto3 has supported hasXXX() methods for checking field presence on non-primitive types since Protobuf [3.15|https://github.com/protocolbuffers/protobuf/releases/tag/v3.15.0]. In our company's internal performance benchmarks, we've seen almost a 10x difference in performance for one of our real production use cases when allowing ReadDefaultValues=False with Proto3. The exact difference in performance depends on the schema complexity and data payload, but we should allow users to set readDefaultValue=False in general.
>
> *Solution*
>
> Support using ReadDefaultValues=False when using Proto3. We need to be careful to check for field presence only on non-primitive types if ReadDefaultValues is false and the version used is Proto3.
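For context, a minimal proto3 sketch of where generated presence checks exist (message and field names are illustrative, not from the Flink code):

{code}
syntax = "proto3";

message Order {
  // Message-typed (non-primitive) fields get generated hasXxx()
  // presence methods, so a converter can skip absent sub-messages
  // entirely instead of materializing their default values.
  Address shipping_address = 1;

  // Since protobuf 3.15, scalar fields can also opt in to explicit
  // presence tracking with the `optional` keyword.
  optional string note = 2;

  // OneOf fields report which member is set via the generated
  // getPaymentCase() method.
  oneof payment {
    Card card = 3;
    Wire wire = 4;
  }
}

message Address { string city = 1; }
message Card { string number = 1; }
message Wire { string iban = 1; }
{code}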
[jira] [Comment Edited] (FLINK-33817) Allow ReadDefaultValues = False for non primitive types on Proto3
[ https://issues.apache.org/jira/browse/FLINK-33817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817008#comment-17817008 ]

Nathan Taylor Armstrong Lewis edited comment on FLINK-33817 at 2/13/24 1:56 PM:
---------------------------------------------------------------------------------

I can confirm that this issue affects Flink version 1.17.x as well.

was (Author: JIRAUSER304121):
I can confirm that this issue affects version 1.17.x as well.
[jira] [Commented] (FLINK-33817) Allow ReadDefaultValues = False for non primitive types on Proto3
[ https://issues.apache.org/jira/browse/FLINK-33817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817008#comment-17817008 ]

Nathan Taylor Armstrong Lewis commented on FLINK-33817:
--------------------------------------------------------

I can confirm that this issue affects version 1.17.x as well.
[jira] [Commented] (FLINK-28747) "target_id can not be missing" in HTTP statefun request
[ https://issues.apache.org/jira/browse/FLINK-28747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816124#comment-17816124 ]

Nathan Taylor Armstrong Lewis commented on FLINK-28747:
--------------------------------------------------------

In Protobuf 3, there is an `optional` label. An unset field could then be distinguished from a field that was set to the default value. Would adding {{optional}} to https://github.com/apache/flink-statefun/blob/accd75ea0109845c4b4c0ddd74021147af1439d4/statefun-sdk-protos/src/main/protobuf/io/kafka-egress.proto#L28 be enough to give the SDKs a way to distinguish a valid empty-string key from an invalid unset key? I'm guessing there would have to be other changes elsewhere, since that file is for the egress and I don't see any equivalent protobuf file for Kafka ingress messages.

> "target_id can not be missing" in HTTP statefun request
> ---------------------------------------------------------
>
>                 Key: FLINK-28747
>                 URL: https://issues.apache.org/jira/browse/FLINK-28747
>             Project: Flink
>          Issue Type: Bug
>          Components: Stateful Functions
>    Affects Versions: statefun-3.0.0, statefun-3.2.0, statefun-3.1.1
>            Reporter: Stephan Weinwurm
>            Priority: Major
>
> Hi all,
> We've suddenly started to see the following exception in our HTTP statefun function endpoints:
> {code}
> Traceback (most recent call last):
>   File "/src/.venv/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
>     result = await app(self.scope, self.receive, self.send)
>   File "/src/.venv/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
>     return await self.app(scope, receive, send)
>   File "/src/worker/baseplate_asgi/asgi/baseplate_asgi_middleware.py", line 37, in __call__
>     await span_processor.execute()
>   File "/src/worker/baseplate_asgi/asgi/asgi_http_span_processor.py", line 61, in execute
>     raise e
>   File "/src/worker/baseplate_asgi/asgi/asgi_http_span_processor.py", line 57, in execute
>     await self.app(self.scope, self.receive, self.send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/applications.py", line 124, in __call__
>     await self.middleware_stack(scope, receive, send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
>     raise exc
>   File "/src/.venv/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
>     await self.app(scope, receive, _send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 75, in __call__
>     raise exc
>   File "/src/.venv/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 64, in __call__
>     await self.app(scope, receive, sender)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 680, in __call__
>     await route.handle(scope, receive, send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 275, in handle
>     await self.app(scope, receive, send)
>   File "/src/.venv/lib/python3.9/site-packages/starlette/routing.py", line 65, in app
>     response = await func(request)
>   File "/src/worker/baseplate_statefun/server/asgi/make_statefun_handler.py", line 25, in statefun_handler
>     result = await handler.handle_async(request_body)
>   File "/src/.venv/lib/python3.9/site-packages/statefun/request_reply_v3.py", line 262, in handle_async
>     msg = Message(target_typename=sdk_address.typename, target_id=sdk_address.id,
>   File "/src/.venv/lib/python3.9/site-packages/statefun/messages.py", line 42, in __init__
>     raise ValueError("target_id can not be missing")
> {code}
> Interestingly, this started happening in three separate Flink deployments at the very same time. The only thing the three deployments have in common is that they consume the same Kafka topics.
> No deployments had happened when the issue started, which was on July 28th at 3:05 PM. We have been seeing the error continuously since.
> We were also able to extract the request that Flink sends to the HTTP statefun endpoint:
> {code}
> {'invocation': {'target': {'namespace': 'com.x.dummy', 'type': 'dummy'}, 'invocations': [{'argument': {'typename': 'type.googleapis.com/v2_event.Event', 'has_value': True, 'value': '-redicated-'}}]}}
> {code}
> As you can see, either no `id` field is present in the `invocation.target` object, or the `target_id` was an empty string.
>
> This is our module.yaml from one of the Flink deployments:
>
> {code}
> version: "3.0"
> module:
>   meta:
>     type: remote
>   spec:
>     endpoints:
>       - endpoint:
>           meta:
>             kind: io.statefun.endpoints.v1/http
>           spec:
>             functions: com.x.dummy/dummy
>             urlPathTemplate:
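A sketch of what the suggested `optional` change to the egress proto might look like. The field names and numbers below are illustrative, not the actual contents of kafka-egress.proto:

{code}
syntax = "proto3";

message KafkaProducerRecord {
  string topic = 1;

  // Marking the key `optional` (supported for proto3 scalar fields
  // since protobuf 3.15) generates a has_key() / HasField("key")
  // presence check, letting SDKs tell an unset key apart from "".
  optional string key = 2;

  bytes value_bytes = 3;
}
{code}

With that, a Python SDK could raise on a truly missing key while still accepting a deliberately empty string as a valid key.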
[jira] [Commented] (FLINK-21227) Upgrade Protobof 3.7.0 for (power)ppc64le support
[ https://issues.apache.org/jira/browse/FLINK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816116#comment-17816116 ]

Nathan Taylor Armstrong Lewis commented on FLINK-21227:
--------------------------------------------------------

Is there any particular reason not to use the {{$\{protoc.version\}}} property defined in [https://github.com/apache/flink/blob/d2abd744621c6f0f65e7154a2c1b53bcaf78e90b/pom.xml#L161] to keep the version of protoc consistent across the repo? Specifically, that might look something like changing [https://github.com/bivasda1/flink/blob/0d5ea7bccf8847b3fdc2049c381764b08dc895e9/flink-formats/flink-parquet/pom.xml#L253] to:
{code:java}
com.google.protobuf:protoc:${protoc.version}:exe:${os.detected.classifier}
{code}
I'm not familiar with parquet, so this might be a horrible idea due to something parquet-specific.

> Upgrade Protobof 3.7.0 for (power)ppc64le support
> --------------------------------------------------
>
>                 Key: FLINK-21227
>                 URL: https://issues.apache.org/jira/browse/FLINK-21227
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>            Reporter: Bivas
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> com.google.protobuf:*protoc:3.5.1:exe* was not supported on Power. Later versions released multi-arch support, including Power (ppc64le). Using *protoc:3.7.0:exe* we were able to build, and the E2E tests passed successfully.
> https://github.com/bivasda1/flink/blob/master/flink-formats/flink-parquet/pom.xml#L253
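For reference, a sketch of how that artifact coordinate typically sits inside a Maven protobuf plugin configuration. The plugin coordinates below are an assumption for illustration; verify against the actual flink-parquet pom.xml before relying on this:

{code:xml}
<plugin>
  <!-- Assumed here to be the xolstice protobuf-maven-plugin; the
       flink-parquet module may use a different plugin. -->
  <groupId>org.xolstice.maven.plugins</groupId>
  <artifactId>protobuf-maven-plugin</artifactId>
  <configuration>
    <!-- ${protoc.version} would come from the root pom property;
         ${os.detected.classifier} is supplied by the os-maven-plugin
         build extension, which enables the multi-arch protoc binaries
         (including ppc64le) this ticket asks for. -->
    <protocArtifact>com.google.protobuf:protoc:${protoc.version}:exe:${os.detected.classifier}</protocArtifact>
  </configuration>
</plugin>
{code}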