Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Anton Okolnychyi
+1

On 2024/05/13 15:33:33 Ryan Blue wrote:
> +1
>
> On Mon, May 13, 2024 at 12:31 AM Mich Talebzadeh wrote:
>> +0
>>
>> For reasons I outlined in the discussion thread:
>> https://lists.apache.org/thread/7r04pz544c9qs3gc8q2nyj3fpzfnv8oo
>>
>> Mich Talebzadeh, Technologist |

Re: [DISCUSS] SPIP: Stored Procedures API for Catalogs

2024-05-11 Thread Anton Okolnychyi
is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, quote "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)

Re: [DISCUSS] SPIP: Stored Procedures API for Catalogs

2024-05-09 Thread Anton Okolnychyi
Thanks to everyone who commented on the design doc. I updated the proposal and it is ready for another look. I hope we can converge and move forward with this effort! - Anton

On Fri, Apr 19, 2024 at 15:54, Anton Okolnychyi wrote:
> Hi folks,
>
> I'd like to start a discussion on SP

[DISCUSS] SPIP: Stored Procedures API for Catalogs

2024-04-19 Thread Anton Okolnychyi
Hi folks, I'd like to start a discussion on SPARK-44167 that aims to enable catalogs to expose custom routines as stored procedures. I believe this functionality will enhance Spark’s ability to interact with external connectors and allow users to perform more operations in plain SQL. SPIP [1]
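For readers skimming the archive, the sketch below illustrates the shape such an API could take. Every name in it (ProcedureCatalog, Procedure, loadProcedure) is an assumption for illustration, not the interface from the SPIP document itself:

// Illustrative sketch only -- names and shapes are assumptions, not the SPIP's final API.
// The idea: a catalog plugin exposes named routines that Spark can resolve and run from SQL.
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.catalog.Identifier
import org.apache.spark.sql.types.StructType

trait ProcedureCatalog {
  // Resolve a routine by identifier, e.g. for CALL cat.system.compact(...)
  def loadProcedure(ident: Identifier): Procedure
}

trait Procedure {
  def name: String
  def parameters: StructType                        // declared input parameters
  def outputType: StructType                        // schema of the rows the call returns
  def call(args: InternalRow): Iterator[InternalRow]
}

// Hypothetical usage from plain SQL once a catalog implements such a trait:
//   CALL my_catalog.system.compact('db.events')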

Re: [VOTE] Release Apache Spark 3.4.0 (RC5)

2023-04-05 Thread Anton Okolnychyi
M Gengliang Wang wrote:
> Hi Anton,
>
> +1 for adding the old constructors back!
> Could you raise a PR for this? I will review it ASAP.
>
> Thanks
> Gengliang
>
> On Wed, Apr 5, 2023 at 9

Re: [VOTE] Release Apache Spark 3.4.0 (RC5)

2023-04-05 Thread Anton Okolnychyi
Sorry, I think my last message did not land on the list. I have a question about changes to exceptions used in the public connector API, such as NoSuchTableException and TableAlreadyExistsException. I consider those as part of the public Catalog API (TableCatalog uses them in method
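To make the compatibility concern concrete, here is a minimal sketch (one method from a hypothetical TableCatalog implementation; the lookup is a placeholder) of why these exception constructors are effectively public API:

// Connectors instantiate NoSuchTableException themselves, so changing its
// constructors can break third-party catalogs compiled against an older
// Spark with NoSuchMethodError at runtime.
import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
import org.apache.spark.sql.connector.catalog.{Identifier, Table}

def loadTable(ident: Identifier): Table = {
  val maybeTable: Option[Table] = None // lookup in the external catalog goes here
  maybeTable.getOrElse(throw new NoSuchTableException(ident))
}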

Re: [VOTE] SPIP: Row-level operations in Data Source V2

2021-11-12 Thread Anton Okolnychyi
+1 from me too to indicate my commitment (non-binding) - Anton

> On 12 Nov 2021, at 18:27, Liang Chi Hsieh wrote:
>
> I'd vote my +1 first.
>
> On 2021/11/13 02:25:05 "L. C. Hsieh" wrote:
>> Hi all,
>>
>> I'd like to start a vote for SPIP: Row-level operations in Data Source V2.
>>
>> The

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-11-12 Thread Anton Okolnychyi
e to shepherd a SPIP, so please let me know if anything I can improve.

> This looks like great features and the rationale claimed by the proposal makes sense. These operations are g

[DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-24 Thread Anton Okolnychyi
Hey everyone, I'd like to start a discussion on adding support for executing row-level operations such as DELETE, UPDATE, MERGE for v2 tables (SPARK-35801). The execution should be the same across data sources and the best way to do that is to implement it in Spark. Right now, Spark can only
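As a rough illustration of the user-facing side (the catalog, table, and column names here are made up), these are the kinds of statements SPARK-35801 wants Spark itself to plan and execute uniformly for v2 tables:

// Hypothetical names; only the statement shapes matter.
spark.sql("DELETE FROM cat.db.events WHERE ts < date'2020-01-01'")
spark.sql("UPDATE cat.db.events SET status = 'archived' WHERE id = 42")
spark.sql("""
  MERGE INTO cat.db.events AS t
  USING updates AS s
  ON t.id = s.id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")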

[DISCUSS][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-03-06 Thread Anton Okolnychyi
Hi devs, I want to follow up on the dev list discussion [1] and the JIRA issue [2] created as a result of it and propose a
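A sketch of the shape such an interface could take follows; it is illustrative only, with placeholder types, not a committed design from the proposal:

// A v2 write would declare how Spark must cluster and sort rows before the write,
// so Spark adds the shuffle/sort itself instead of each source reimplementing it.
trait RequiresDistributionAndOrdering {
  def requiredDistribution: Distribution      // e.g. cluster by partition columns
  def requiredOrdering: Array[SortOrder]      // sort to apply within each write task
}

// Placeholder types for this sketch; a real proposal would define richer expressions.
sealed trait Distribution
case class ClusteredDistribution(clusteringColumns: Seq[String]) extends Distribution
case class SortOrder(column: String, ascending: Boolean = true)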

Re: [VOTE] Release Apache Spark 2.3.3 (RC1)

2019-01-23 Thread Anton Okolnychyi
t does not need to go into 2.3.3. If it's a real bug, sure it can be merged to 2.3.x.

On Wed, Jan 23, 2019 at 7:54 AM Anton Okolnychyi wrote:
> Recently, I came across this bug: https://issues.apache.org/jira/browse/SPARK-26706.
>
> It seem

Re: [VOTE] Release Apache Spark 2.3.3 (RC1)

2019-01-23 Thread Anton Okolnychyi
Recently, I came across this bug: https://issues.apache.org/jira/browse/SPARK-26706. It seems appropriate to include it in 2.3.3, doesn't it? Thanks, Anton

On Wed, Jan 23, 2019 at 13:08, Takeshi Yamamuro wrote:
> Thanks for the check, Felix!
>
> Yea, I'll wait for the new test report. But, it never

[SS] Custom Sinks

2017-11-01 Thread Anton Okolnychyi
Hi all, I have a question about the future of custom data sinks in Structured Streaming. In particular, I want to know how continuous processing and the Datasource API V2 will impact them. Right now, it is possible to have custom data sinks via the current Datasource API (V1) by implementing
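For context, this is the V1 hook being referred to. A minimal sketch follows; the provider class name and the println "delivery" are illustrative only:

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

class MySinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink = new Sink {
    // Called once per micro-batch; `data` is only valid within this call.
    override def addBatch(batchId: Long, data: DataFrame): Unit =
      data.collect().foreach(println)
  }
}

// Used as: df.writeStream.format("my.package.MySinkProvider").start()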

[SQL] Return Type of Round Func

2017-07-04 Thread Anton Okolnychyi
Hi all, I have a question regarding the round() function, which was developed a long time ago as SPARK-8159. Currently, the return type is the same as the input type. That is reasonable, but it does not match Hive's behavior. As I understand, Hive produces either double or decimal as output (see here
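A quick way to inspect the behavior in question (an illustrative spark-shell check, not from the original thread):

val df = spark.sql("SELECT round(CAST(1.2345 AS FLOAT), 2) AS r")
df.printSchema()  // shows r as float: Spark keeps the input type,
                  // whereas Hive reportedly widens to double or decimal here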

Re: [Spark SQL] Nanoseconds in Timestamps are set as Microseconds

2017-06-02 Thread Anton Okolnychyi
Then let me provide a PR so that we can discuss an alternative way.

2017-06-02 8:26 GMT+02:00 Reynold Xin <r...@databricks.com>:
> Seems like a bug we should fix? I agree some form of truncation makes more sense.
>
> On Thu, Jun 1, 2017 at 1:17 AM, Anton Okolnyc

[Spark SQL] Nanoseconds in Timestamps are set as Microseconds

2017-06-01 Thread Anton Okolnychyi
Hi all, I would like to ask what the community thinks about the way Spark handles nanoseconds in the Timestamp type. As far as I can see in the code, Spark assumes microsecond precision. Therefore, I expect to get either a timestamp truncated to microseconds or an exception if I specify a
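A minimal reproduction of the scenario being described (a sketch, assuming a spark-shell session):

// The literal has 9 fractional digits, but Spark's internal TimestampType
// representation is microseconds (6 digits), so something must happen to
// the extra nanoseconds.
import java.sql.Timestamp
import spark.implicits._

val ts = Timestamp.valueOf("2017-06-01 00:00:00.123456789")
val df = Seq(ts).toDF("t")
df.show(false)
// The question above: should the value come back truncated to .123456,
// or should Spark raise an error instead of silently mishandling the nanos?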

Re: [Spark SQL] ceil and floor functions on doubles

2017-05-19 Thread Anton Okolnychyi
hive> select 9.223372036854786E20, ceil(9.223372036854786E20);
OK
_c0                   _c1
9.223372036854786E20  9223372036854775807
Time taken: 2.041 seconds, Fetched: 1 row(s)

Bests,
Dongjoon.

From: Anton Okol

[Spark SQL] ceil and floor functions on doubles

2017-05-19 Thread Anton Okolnychyi
Hi all, I am wondering why the results of ceil and floor functions on doubles are internally casted to longs. This causes loss of precision since doubles can hold bigger numbers. Consider the following example: // 9.223372036854786E20 is greater than Long.MaxValue val df =
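The example above is cut off by the archive; a sketch of the same reproduction (assuming a spark-shell session) might look like this:

// 9.223372036854786E20 is roughly 100x Long.MaxValue (about 9.22E18),
// so a long cannot represent the true ceil/floor result.
import org.apache.spark.sql.functions.{ceil, col, floor}
import spark.implicits._

val df = Seq(9.223372036854786E20).toDF("d")
df.select(ceil(col("d")), floor(col("d"))).show(false)
// Both columns print 9223372036854775807 (Long.MaxValue): the double result
// is cast to long and clamped, which is the precision loss described above.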

[SPARK-16046] PR Review

2017-01-24 Thread Anton Okolnychyi
Hi all, there is a pull request that I would like to bring back to life. It is related to the SQL programming guide and can be found here. I believe the PR should be helpful. The initial review is done already. Also, I updated it recently and checked

Re: Expand the Spark SQL programming guide?

2016-12-18 Thread Anton Okolnychyi
Any comments/suggestions are more than welcome. Thanks, Anton

2016-12-18 15:08 GMT+01:00 Anton Okolnychyi <anton.okolnyc...@gmail.com>:
> Here is the pull request: https://github.com/apache/spark/pull/16329
>
> 2016-12-16 20:54 GMT+01:00 Jim Hughes <jn...@ccri

Re: Expand the Spark SQL programming guide?

2016-12-18 Thread Anton Okolnychyi
ld be fine.

Thanks!

On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:
> Yes - that sounds good Anton, I can work on documenting the window functions.
>
> From: Anton Okolnychyi <anton.okolnyc...@gmail.com>

Re: Expand the Spark SQL programming guide?

2016-12-15 Thread Anton Okolnychyi
eospatial user-defined types and functions. Having examples of aggregations and window functions would be awesome! I did test out implementing a distributed convex hull as a UserDefinedAggregateFunction, and that seemed to work sensibly. Cheers, Jim

Expand the Spark SQL programming guide?

2016-12-15 Thread Anton Okolnychyi
Hi, I am wondering whether it makes sense to expand the Spark SQL programming guide with examples of aggregations (including user-defined via the Aggregator API) and window functions. For instance, there might be a separate subsection under "Getting Started" for each functionality. SPARK-16046
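As an example of the kind of content such a subsection could carry (a sketch, not proposed guide text; the names are illustrative), a typed average via the Aggregator API:

import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class AvgBuffer(sum: Double, count: Long)

object TypedAverage extends Aggregator[Double, AvgBuffer, Double] {
  def zero: AvgBuffer = AvgBuffer(0.0, 0L)
  def reduce(b: AvgBuffer, a: Double): AvgBuffer = AvgBuffer(b.sum + a, b.count + 1)
  def merge(b1: AvgBuffer, b2: AvgBuffer): AvgBuffer =
    AvgBuffer(b1.sum + b2.sum, b1.count + b2.count)
  def finish(r: AvgBuffer): Double = r.sum / r.count
  def bufferEncoder: Encoder[AvgBuffer] = Encoders.product[AvgBuffer]
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

// ds.select(TypedAverage.toColumn) on a Dataset[Double] yields the average.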

Typo in the programming guide?

2016-11-27 Thread Anton Okolnychyi
Hi guys, I am looking at the Accumulator section in the latest programming guide. Is there a typo in the sample code? Shouldn't the add() method accept only one parameter in Spark 2.0? It looks like the signature is inherited from AccumulatorParam, which was there before. object VectorAccumulatorV2
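For reference, here is a sketch consistent with the Spark 2.0 AccumulatorV2 contract (not the guide's exact code): add() takes a single value, unlike the old AccumulatorParam.

import scala.collection.mutable.ArrayBuffer
import org.apache.spark.util.AccumulatorV2

class VectorAccumulatorV2 extends AccumulatorV2[Double, ArrayBuffer[Double]] {
  private val buf = ArrayBuffer.empty[Double]
  def isZero: Boolean = buf.isEmpty
  def copy(): VectorAccumulatorV2 = { val c = new VectorAccumulatorV2; c.buf ++= buf; c }
  def reset(): Unit = buf.clear()
  def add(v: Double): Unit = buf += v          // one parameter, as the guide should show
  def merge(other: AccumulatorV2[Double, ArrayBuffer[Double]]): Unit =
    buf ++= other.value
  def value: ArrayBuffer[Double] = buf
}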

Fwd:

2016-11-15 Thread Anton Okolnychyi
Hi, I have experienced a problem using the Datasets API in Spark 1.6, while almost identical code works fine in Spark 2.0. The problem is related to encoders and custom aggregators. Spark 1.6 (the aggregation produces an empty map): implicit val intStringMapEncoder: Encoder[Map[Int,
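The declaration above is cut off by the archive. One plausible way the 1.6-era code could have continued (an assumption, since the original is truncated) is a Kryo-based encoder:

// Assumption: a plausible completion of the truncated declaration above, using
// the Kryo encoder Spark 1.6 provided for types it could not encode natively.
// The original message may have used a different construction.
import org.apache.spark.sql.{Encoder, Encoders}

implicit val intStringMapEncoder: Encoder[Map[Int, String]] =
  Encoders.kryo[Map[Int, String]]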

Code Style Formatting

2016-07-01 Thread Anton Okolnychyi
configurations that I can import to IntelliJ IDEA to adjust how it does the formatting. Is it possible to avoid the manual configuration? Best regards, Anton Okolnychyi