[Question] Does Apache SnowflakeIO support Apache Parquet?

2024-03-04 Thread Xinmin
Hello guru,

We would like to write data to Snowflake with S3 bucket, basically, the
connector uploads the file in the CSV format to S3 bucket before copying
data into the table. Please refer to below.

[image: image.png]

Do you have any plans to support Apache Parquet? Thanks.

Regards,
Xinmin


Re: Roadmap of Calcite support on Beam SQL?

2024-03-04 Thread Robert Bradshaw via user
There is no longer a huge amount of active development going on here,
but implementing a missing function seems like an easy contribution
(lots of examples to follow). Otherwise, definitely worth filing a
feature request as a useful signal for prioritization.

On Mon, Mar 4, 2024 at 4:33 PM Jaehyeon Kim  wrote:
>
> Hi Wiśniowski
>
> Thank you so much for your reply. The query works for me. As I'm new to Beam 
> SQL, I'd spend some more time before issuing a feature request.
>
> Cheers,
> Jaehyeon
>
> On Mon, 4 Mar 2024 at 23:03, Wiśniowski Piotr 
>  wrote:
>>
>> Hi,
>>
>> 1. I do not have up to date knowledge, but Beam sql was missing quite a lot 
>> of things regarding Calcite full support. I think the current way is to 
>> create a feature request on repository and get votes and interest. I 
>> definitely would vote for You initiative ;)
>>
>> 2. Regarding the query itself I got it working for something like this:
>> ```
>>
>> WITH cte AS (
>> SELECT CAST(event_datetime AS TIMESTAMP) AS ts
>> FROM PCOLLECTION
>> )
>> SELECT
>> CAST(TUMBLE_START(cte.ts, INTERVAL '1' MINUTE) AS VARCHAR) AS start_time,
>> CAST(TUMBLE_END(cte.ts, INTERVAL '1' MINUTE) AS VARCHAR) AS end_time,
>> COUNT(*) AS page_views
>> FROM cte
>> GROUP BY
>> TUMBLE(cte.ts, INTERVAL '1' MINUTE)
>> ;
>>
>> ```
>>
>> Maybe it would be useful for you. Note that I am not up to date with recent 
>> versions of Beam SQL, but I will need to catch up (the syntax for defining 
>> window on table is cool).
>>
>> Best
>>
>> Wiśniowski Piotr
>>
>> On 4.03.2024 05:27, Jaehyeon Kim wrote:
>>
>> Hello,
>>
>> I just tried a simple tumbling window but failed with the following error
>>
>> RuntimeError: org.apache.beam.sdk.extensions.sql.impl.ParseException: Unable 
>> to parse query
>> WITH cte AS (
>> SELECT TO_TIMESTAMP(event_datetime) AS ts FROM PCOLLECTION
>> )
>> SELECT
>> CAST(window_start AS VARCHAR) AS start_time,
>> CAST(window_end AS VARCHAR) AS end_time,
>> COUNT(*) AS page_views
>> FROM TABLE(
>> TUMBLE(TABLE cte, DESCRIPTOR(ts), 'INTERVAL 1 MINUTE')
>> )
>> GROUP BY
>> window_start, window_end
>>
>> I guess it is because TO_TIMESTAMP is not implemented. When I check the 
>> document, it misses lots of functions. Is there any roadmap about Calcite 
>> support on Beam SQL?
>>
>> Cheers,
>> Jaehyeon


Re: Roadmap of Calcite support on Beam SQL?

2024-03-04 Thread Jaehyeon Kim
Hi Wiśniowski

Thank you so much for your reply. The query works for me. As I'm new to
Beam SQL, I'd spend some more time before issuing a feature request.

Cheers,
Jaehyeon

On Mon, 4 Mar 2024 at 23:03, Wiśniowski Piotr <
contact.wisniowskipi...@gmail.com> wrote:

> Hi,
>
> 1. I do not have up to date knowledge, but Beam sql was missing quite a
> lot of things regarding Calcite full support. I think the current way is to
> create a feature request on repository and get votes and interest. I
> definitely would vote for You initiative ;)
>
> 2. Regarding the query itself I got it working for something like this:
> ```
> WITH cte AS (
> SELECT CAST(event_datetime AS TIMESTAMP) AS ts
> FROM PCOLLECTION
> )
> SELECT
> CAST(TUMBLE_START(cte.ts, INTERVAL '1' MINUTE) AS VARCHAR) AS start_time,
> CAST(TUMBLE_END(cte.ts, INTERVAL '1' MINUTE) AS VARCHAR) AS end_time,
> COUNT(*) AS page_views
> FROM cte
> GROUP BY
> TUMBLE(cte.ts, INTERVAL '1' MINUTE)
> ;
>
> ```
>
> Maybe it would be useful for you. Note that I am not up to date with
> recent versions of Beam SQL, but I will need to catch up (the syntax for
> defining window on table is cool).
>
> Best
>
> Wiśniowski Piotr
> On 4.03.2024 05:27, Jaehyeon Kim wrote:
>
> Hello,
>
> I just tried a simple tumbling window but failed with the following error
>
> RuntimeError: org.apache.beam.sdk.extensions.sql.impl.ParseException:
> Unable to parse query
> WITH cte AS (
> SELECT TO_TIMESTAMP(event_datetime) AS ts FROM PCOLLECTION
> )
> SELECT
> CAST(window_start AS VARCHAR) AS start_time,
> CAST(window_end AS VARCHAR) AS end_time,
> COUNT(*) AS page_views
> FROM TABLE(
> TUMBLE(TABLE cte, DESCRIPTOR(ts), 'INTERVAL 1 MINUTE')
> )
> GROUP BY
> window_start, window_end
>
> I guess it is because TO_TIMESTAMP is not implemented. When I check the
> document, it misses lots of functions. Is there any roadmap about Calcite
> support on Beam SQL?
>
> Cheers,
> Jaehyeon
>
>


Re: Roadmap of Calcite support on Beam SQL?

2024-03-04 Thread Wiśniowski Piotr

Hi,

1. I do not have up to date knowledge, but Beam sql was missing quite a 
lot of things regarding Calcite full support. I think the current way is 
to create a feature request on repository and get votes and interest. I 
definitely would vote for You initiative ;)


2. Regarding the query itself I got it working for something like this:
```

WITHcte AS(
SELECTCAST(event_datetime ASTIMESTAMP) ASts
FROMPCOLLECTION
)
SELECT
CAST(TUMBLE_START(cte.ts, INTERVAL '1'MINUTE) ASVARCHAR) ASstart_time,
CAST(TUMBLE_END(cte.ts, INTERVAL '1'MINUTE) ASVARCHAR) ASend_time,
COUNT(*) ASpage_views
FROMcte
GROUP BY
TUMBLE(cte.ts, INTERVAL '1'MINUTE)
;

```

Maybe it would be useful for you. Note that I am not up to date with 
recent versions of Beam SQL, but I will need to catch up (the syntax for 
defining window on table is cool).


Best

Wiśniowski Piotr

On 4.03.2024 05:27, Jaehyeon Kim wrote:

Hello,

I just tried a simple tumbling window but failed with the following error

RuntimeError: org.apache.beam.sdk.extensions.sql.impl.ParseException: 
Unable to parse query

    WITH cte AS (
        SELECT TO_TIMESTAMP(event_datetime) AS ts FROM PCOLLECTION
    )
    SELECT
        CAST(window_start AS VARCHAR) AS start_time,
        CAST(window_end AS VARCHAR) AS end_time,
        COUNT(*) AS page_views
    FROM TABLE(
            TUMBLE(TABLE cte, DESCRIPTOR(ts), 'INTERVAL 1 MINUTE')
        )
    GROUP BY
        window_start, window_end

I guess it is because TO_TIMESTAMP is not implemented. When I check 
the document, it misses lots of functions. Is there any roadmap about 
Calcite support on Beam SQL?


Cheers,
Jaehyeon