Unsubscribe

2023-08-03 Thread Denys Cherepanin
Unsubscribe


Re: Query hints visible to DSV2 connectors?

2023-08-03 Thread Ryan Blue
You probably want to use data source options. Those get passed through but
can't be set in SQL.
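
For illustration only, here is a minimal sketch of how a connector might interpret such an option. It assumes a hypothetical option key `readerMode`, which a caller could supply with something like `spark.read.format("my-connector").option("readerMode", "columnar").load()` (the format name is also hypothetical). In a real DSV2 connector the options would arrive as the `CaseInsensitiveStringMap` passed to `SupportsRead.newScanBuilder`; a plain `Map` stands in for it here so the sketch runs without Spark on the classpath. The fallback threshold is an arbitrary placeholder for the schema-based choice described in the question.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class ReaderModeOption {
    enum Mode { ROW, COLUMNAR }

    // Hypothetical option key; it is not defined by Spark.
    static final String KEY = "readerMode";

    // Assumed heuristic: schemas at least this wide default to columnar reads.
    static final int COLUMNAR_MIN_COLUMNS = 8;

    static Mode choose(Map<String, String> options, int columnCount) {
        // Copy into a case-insensitive map, since Spark hands data source
        // options to connectors with case-insensitive keys.
        Map<String, String> ci = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        ci.putAll(options);

        String requested = ci.get(KEY);
        if (requested != null) {
            // An explicit per-read option overrides the heuristic.
            switch (requested.trim().toLowerCase(Locale.ROOT)) {
                case "row":
                    return Mode.ROW;
                case "columnar":
                    return Mode.COLUMNAR;
                default:
                    throw new IllegalArgumentException(
                        "Unsupported " + KEY + " value: " + requested);
            }
        }
        // No option supplied: fall back to the schema-based choice.
        return columnCount >= COLUMNAR_MIN_COLUMNS ? Mode.COLUMNAR : Mode.ROW;
    }

    public static void main(String[] args) {
        System.out.println(choose(Map.of("READERMODE", "columnar"), 2));
        System.out.println(choose(Map.of(), 2));
    }
}
```

Because the option travels with the individual read rather than with the session configuration, this gives per-query control without touching a global config parameter.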

On Wed, Aug 2, 2023 at 5:39 PM Alex Cruise wrote:

> Hey folks,
>
> I'm adding an optional feature to my DSV2 connector where it can choose
> between a row-based or columnar PartitionReader dynamically depending on a
> query's schema. I'd like to be able to supply a hint at query time that's
> visible to the connector, but at the moment I can't see any way to
> accomplish that.
>
> From what I can see the artifacts produced by the existing hint system [
> https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-hints.html
> or sql("select 1").hint("foo").show()] aren't visible from the
> TableCatalog/Table/ScanBuilder.
>
> I guess I could set a config parameter but I'd rather do this on a
> per-query basis. Any tips?
>
> Thanks!
>
> -0xe1a
>


-- 
Ryan Blue
Tabular


Re: LLM script for error message improvement

2023-08-03 Thread Maciej
I am sitting on the fence about that. In the linked PR Xiao wrote the
following:

> We published the error guideline a few years ago, but not all
> contributors adhered to it, resulting in variable quality in error messages.

If a policy exists but is not enforced (if that's indeed the case, I
didn't go through the source to confirm that) it might be useful to
learn the reasons why it happens. Normally, I'd expect one of the
following:

- The policy is too complex to enforce. In such a case, additional
  tooling can be useful.
- The policy is not well known, and the people responsible for
  introducing it are not committed to enforcing it.
- The policy or some of its components don't really reflect community
  values and expectations.

If the problem of suspected violations was never raised on our standard
communication channel, and as far as I can tell, it has not, then
introducing a new tool to enforce the policy seems a bit premature.

If these were the only considerations, I'd say that improving the
overall consistency of the project outweighs possible risks, even if the
case for such a tool might be poorly supported.

However, there is an elephant in the room. This is another attempt, after
SPARK-44546, to embed generative tools directly within the Spark dev
workflow. In principle, I am not against such tools. In fact, it is
pretty clear that they are already used by Spark committers, and even if
we wanted to, there is little we can do to prevent that. In such cases,
decisions about which tools, if any, to use, to what extent, and how to
treat their output are the sole responsibility of contributors.

In contrast, these proposals try to push a proprietary tool burdened
with serious privacy and ethical issues and likely to introduce unclear
liabilities as a standard or even required developer tool.

I can't speak for others, but personally, I'm quite uneasy about it. If
we go this way, I strongly believe that it should be preceded by a
serious discussion, if not the development of a formal policy, about
which categories of tools are acceptable within the project, in what
capacity, and to what extent. Ideally, with an official opinion from
the ASF as the copyright owner.

WDYT All? Shall we start a separate discussion?

Best regards,
Maciej Szymkiewicz

Web: https://zero323.net
PGP: A30CEF0C31A501EC

On 8/3/23 18:33, Haejoon Lee wrote:

> Additional information:
>
> Please check https://issues.apache.org/jira/browse/SPARK-37935 if you
> want to start contributing to improving error messages.
>
> You can create sub-tasks if you believe there are error messages that
> need improvement, in addition to the tasks listed in the umbrella JIRA.
>
> You can also refer to https://github.com/apache/spark/pull/41504,
> https://github.com/apache/spark/pull/41455 as an example PR.
>
> On Thu, Aug 3, 2023 at 1:10 PM Ruifeng Zheng wrote:
>
>> +1 from my side, I'm fine to have it as a helper script
>>
>> On Thu, Aug 3, 2023 at 10:53 AM Hyukjin Kwon wrote:
>>
>>> I think adding that dev tool script to improve the error
>>> message is fine.
>>>
>>> On Thu, 3 Aug 2023 at 10:24, Haejoon Lee wrote:
>>>
>>>> Dear contributors, I hope you are doing well!
>>>>
>>>> I see there are contributors who are interested in working
>>>> on error message improvements and persistent contribution,
>>>> so I want to share an llm-based error message improvement
>>>> script for helping your contribution.
>>>>
>>>> You can find a detail for the script at
>>>> https://github.com/apache/spark/pull/41711. I believe this
>>>> can help your error message improvement work, so I
>>>> encourage you to take a look at the pull request and
>>>> leverage the script.
>>>>
>>>> Please let me know if you have any questions or concerns.
>>>>
>>>> Thanks all for your time and contributions!
>>>>
>>>> Best regards,
>>>>
>>>> Haejoon





Re: LLM script for error message improvement

2023-08-03 Thread Haejoon Lee
Additional information:

Please check https://issues.apache.org/jira/browse/SPARK-37935 if you want
to start contributing to improving error messages.

You can create sub-tasks if you believe there are error messages that need
improvement, in addition to the tasks listed in the umbrella JIRA.

You can also refer to https://github.com/apache/spark/pull/41504,
https://github.com/apache/spark/pull/41455 as an example PR.

On Thu, Aug 3, 2023 at 1:10 PM Ruifeng Zheng wrote:

> +1 from my side, I'm fine to have it as a helper script
>
> On Thu, Aug 3, 2023 at 10:53 AM Hyukjin Kwon wrote:
>
>> I think adding that dev tool script to improve the error message is fine.
>>
>> On Thu, 3 Aug 2023 at 10:24, Haejoon Lee wrote:
>>
>>> Dear contributors, I hope you are doing well!
>>>
>>> I see there are contributors who are interested in working on error
>>> message improvements and persistent contribution, so I want to share an
>>> llm-based error message improvement script for helping your contribution.
>>>
>>> You can find a detail for the script at
>>> https://github.com/apache/spark/pull/41711. I believe this can help
>>> your error message improvement work, so I encourage you to take a look at
>>> the pull request and leverage the script.
>>>
>>> Please let me know if you have any questions or concerns.
>>>
>>> Thanks all for your time and contributions!
>>>
>>> Best regards,
>>>
>>> Haejoon
>>>
>>