Thank You for the reply and a hint.
1. Yes did try with Calcite `ROW` too - `java.lang.NoSuchFieldException:
head (state=,code=0)` but on the transformation side `SELECT * FROM
etl_raw LIMIT 1`. Maybe I need to directly refer to a field that I need
instead of using `*`? Do You know from top of your head whats the syntax
to get `repo_state.head.commit` value? Did try also to add more fields
to the table def but no luck.
2.
> doesn't actually do anything on the SQL shell at query parse time
This is also my observation. What is the proper way to initialize Zeta?
Did try `./bin/shell
--plannerName=org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner`
and few other variations but seems like sqlline does not allow for
passing parameters to pipeline options.
I do not have any requirement specific for Zeta but it seems it has a
bit more support for JSON functions than Calcite.
3. Any other idea to workaround this issue?
Thanks for support!
For context I do try to setup some POC with this SQLs as it could really
speed up development and maintenance of our pipelines + SQL has low
requirements regarding knowledge to actually understand the logic (read
the code). If this works it might be a great (killer) feature for Beam.
Best
Wisniowski Piotr
On 20.04.2023 20:52, Andrew Pilloud via user wrote:
set plannerName doesn't actually do anything on the SQL shell at query
parse time, it will still use the calcite parser. Have you tried
calcite SQL?
Support for struts is somewhat limited. I know there are bugs around
nested structs and structs with single values.
Andrew
On Thu, Apr 20, 2023 at 9:26 AM Wiśniowski Piotr
<contact.wisniowskipi...@gmail.com> wrote:
Hi,
I have a question regarding usage of Zeta with SQL extensions in SQL
shell. I try to:
```
SET runner = DirectRunner;
SET tempLocation = `/tmp/test/`;
SET streaming=`True`;
SET plannerName =
`org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner`;
CREATE EXTERNAL TABLE etl_raw(
event_timestamp TIMESTAMP,
event_type VARCHAR,
message_id VARCHAR,
tracking_source VARCHAR,
tracking_version VARCHAR,
`repo_state` STRUCT<`head` STRUCT<`commit` VARCHAR ,`name`
VARCHAR>>
)
TYPE pubsub
LOCATION 'projects/xxx/topics/xxx'
TBLPROPERTIES '{"format":"json"}';
```
But get error `parse failed: Encountered "STRUCT" `.
If i change the `STRUCT` to `ROW` (as in Calcite) the DDL passes, but
still I do fail to receive data on
`SELECT * FROM etl_raw LIMIT 1;` with exception of
`java.lang.NoSuchFieldException: head (state=,code=0)` when I am sure
that the field is there in json payload.
With commented out `repo_state` filed I am able to retrieve the data.
Unfortunately I do not have control over the payload structure as its
3rd party hook to make it flat.
In general I am unable to parse json msg from pubsub having
structured
field.
Is anyone familiar with this part of Beam functionalities?
Best regards
Wisniowski Piotr