Re: More publicly documenting the options under spark.sql.*

2020-02-09 Thread Hyukjin Kwon
The PR was merged. Now all external SQL configurations will be
automatically documented.

On Wed, Feb 5, 2020 at 9:46 AM, Hyukjin Kwon wrote:

> FYI, a PR is open at https://github.com/apache/spark/pull/27459. Thanks,
> Nicholas.
> Hope you all find some time to take a look.
>
> On Tue, Jan 28, 2020 at 8:15 AM, Nicholas Chammas wrote:
>
>> I am! Thanks for the reference.
>>
>> On Thu, Jan 16, 2020 at 9:53 PM Hyukjin Kwon  wrote:
>>
>>> Nicholas, are you interested in taking a stab at this? You could refer
>>> https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3
>>>
>>> On Fri, Jan 17, 2020 at 8:48 AM, Takeshi Yamamuro wrote:
>>>
 The idea looks nice. I think web documents always help end users.

 Bests,
 Takeshi

 On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu <
 shixi...@databricks.com> wrote:

> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
> configurations. Should be pretty easy to automatically generate a SQL
> configuration page.
>
> Best Regards,
> Ryan
>
>
> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon 
> wrote:
>
>> I think automatically creating a configuration page isn't a bad idea
>> because I think we deprecate and remove configurations which are not
>> created via .internal() in SQLConf anyway.
>>
>> I already tried this automatic generation from the codes at SQL
>> built-in functions and I'm pretty sure we can do the similar thing for
>> configurations as well.
>>
>> We could perhaps mimic what hadoop does
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>
>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>
>>> Some of it is intentionally undocumented, as far as I know, as an
>>> experimental option that may change, or legacy, or safety valve flag.
>>> Certainly anything that's marked an internal conf. (That does raise
>>> the question of who it's for, if you have to read source to find it.)
>>>
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>>
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>  wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>> option, but it turns out that there's a whole bunch of stuff under
>>> SQLConf.scala that has no public documentation under
>>> http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a
>>> documentation page from SQLConf.scala, as Hyukjin suggested on that 
>>> ticket?
>>> >
>>> > Another thought that comes to mind is moving the config
>>> definitions out of Scala and into a data format like YAML or JSON, and 
>>> then
>>> sourcing that both for SQLConf as well as for whatever documentation 
>>> page
>>> we want to generate. What do you think of that idea?
>>> >
>>> > Nick
>>> >
>>>
>>>
>>>

 --
 ---
 Takeshi Yamamuro

>>>


Re: More publicly documenting the options under spark.sql.*

2020-02-04 Thread Hyukjin Kwon
FYI, a PR is open at https://github.com/apache/spark/pull/27459. Thanks,
Nicholas.
Hope you all find some time to take a look.

On Tue, Jan 28, 2020 at 8:15 AM, Nicholas Chammas wrote:

> I am! Thanks for the reference.
>
> On Thu, Jan 16, 2020 at 9:53 PM Hyukjin Kwon  wrote:
>
>> Nicholas, are you interested in taking a stab at this? You could refer
>> https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3
>>
>> On Fri, Jan 17, 2020 at 8:48 AM, Takeshi Yamamuro wrote:
>>
>>> The idea looks nice. I think web documents always help end users.
>>>
>>> Bests,
>>> Takeshi
>>>
>>> On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu <
>>> shixi...@databricks.com> wrote:
>>>
 "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
 configurations. Should be pretty easy to automatically generate a SQL
 configuration page.

 Best Regards,
 Ryan


 On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon 
 wrote:

> I think automatically creating a configuration page isn't a bad idea
> because I think we deprecate and remove configurations which are not
> created via .internal() in SQLConf anyway.
>
> I already tried this automatic generation from the codes at SQL
> built-in functions and I'm pretty sure we can do the similar thing for
> configurations as well.
>
> We could perhaps mimic what hadoop does
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>
> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>
>> Some of it is intentionally undocumented, as far as I know, as an
>> experimental option that may change, or legacy, or safety valve flag.
>> Certainly anything that's marked an internal conf. (That does raise
>> the question of who it's for, if you have to read source to find it.)
>>
>> I don't know if we need to overhaul the conf system, but there may
>> indeed be some confs that could legitimately be documented. I don't
>> know which.
>>
>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>  wrote:
>> >
>> > I filed SPARK-30510 thinking that we had forgotten to document an
>> option, but it turns out that there's a whole bunch of stuff under
>> SQLConf.scala that has no public documentation under
>> http://spark.apache.org/docs.
>> >
>> > Would it be appropriate to somehow automatically generate a
>> documentation page from SQLConf.scala, as Hyukjin suggested on that 
>> ticket?
>> >
>> > Another thought that comes to mind is moving the config definitions
>> out of Scala and into a data format like YAML or JSON, and then sourcing
>> that both for SQLConf as well as for whatever documentation page we want 
>> to
>> generate. What do you think of that idea?
>> >
>> > Nick
>> >
>>
>>
>>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>


Re: More publicly documenting the options under spark.sql.*

2020-01-27 Thread Nicholas Chammas
I am! Thanks for the reference.

On Thu, Jan 16, 2020 at 9:53 PM Hyukjin Kwon  wrote:

> Nicholas, are you interested in taking a stab at this? You could refer
> https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3
>
> On Fri, Jan 17, 2020 at 8:48 AM, Takeshi Yamamuro wrote:
>
>> The idea looks nice. I think web documents always help end users.
>>
>> Bests,
>> Takeshi
>>
>> On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu <
>> shixi...@databricks.com> wrote:
>>
>>> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
>>> configurations. Should be pretty easy to automatically generate a SQL
>>> configuration page.
>>>
>>> Best Regards,
>>> Ryan
>>>
>>>
>>> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon 
>>> wrote:
>>>
 I think automatically creating a configuration page isn't a bad idea
 because I think we deprecate and remove configurations which are not
 created via .internal() in SQLConf anyway.

 I already tried this automatic generation from the codes at SQL
 built-in functions and I'm pretty sure we can do the similar thing for
 configurations as well.

 We could perhaps mimic what hadoop does
 https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml

 On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:

> Some of it is intentionally undocumented, as far as I know, as an
> experimental option that may change, or legacy, or safety valve flag.
> Certainly anything that's marked an internal conf. (That does raise
> the question of who it's for, if you have to read source to find it.)
>
> I don't know if we need to overhaul the conf system, but there may
> indeed be some confs that could legitimately be documented. I don't
> know which.
>
> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>  wrote:
> >
> > I filed SPARK-30510 thinking that we had forgotten to document an
> option, but it turns out that there's a whole bunch of stuff under
> SQLConf.scala that has no public documentation under
> http://spark.apache.org/docs.
> >
> > Would it be appropriate to somehow automatically generate a
> documentation page from SQLConf.scala, as Hyukjin suggested on that 
> ticket?
> >
> > Another thought that comes to mind is moving the config definitions
> out of Scala and into a data format like YAML or JSON, and then sourcing
> that both for SQLConf as well as for whatever documentation page we want 
> to
> generate. What do you think of that idea?
> >
> > Nick
> >
>
>
>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Hyukjin Kwon
Each configuration already has its documentation. What we need to do is
just list them.
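
(For concreteness — a simplified, illustrative sketch of how entries are declared
inside object SQLConf in SQLConf.scala. The keys below are made up; the point is
the pattern: the .doc() string is the documentation a generated page would
surface, and .internal() marks entries to leave out.)

    // Simplified sketch of SQLConf-style declarations, as they would appear
    // inside object SQLConf. The keys here are hypothetical.
    val SOME_USER_FACING_FLAG = buildConf("spark.sql.example.someFlag")
      .doc("User-facing flag; this .doc() text is what a generated page would list.")
      .booleanConf
      .createWithDefault(false)

    val SOME_INTERNAL_FLAG = buildConf("spark.sql.example.someInternalFlag")
      .internal() // excluded from SET -v and from any generated documentation
      .doc("Internal flag; intentionally left out of the public docs.")
      .booleanConf
      .createWithDefault(false)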

On Fri, Jan 17, 2020 at 12:25 PM, Jules Damji wrote:

> It’s one thing to get the names/values of the configurations via
> spark.sql(“set -v”), but another thing to understand what each achieves and
> when and why you’ll want to use it.
>
> A webpage with a table and a description of each is a huge benefit.
>
> Cheers
> Jules
>
> Sent from my iPhone
> Pardon the dumb thumb typos :)
>
> On Jan 16, 2020, at 11:04 AM, Shixiong(Ryan) Zhu 
> wrote:
>
> 
> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
> configurations. Should be pretty easy to automatically generate a SQL
> configuration page.
>
> Best Regards,
> Ryan
>
>
> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon  wrote:
>
>> I think automatically creating a configuration page isn't a bad idea
>> because I think we deprecate and remove configurations which are not
>> created via .internal() in SQLConf anyway.
>>
>> I already tried this automatic generation from the codes at SQL built-in
>> functions and I'm pretty sure we can do the similar thing for
>> configurations as well.
>>
>> We could perhaps mimic what hadoop does
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>
>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>
>>> Some of it is intentionally undocumented, as far as I know, as an
>>> experimental option that may change, or legacy, or safety valve flag.
>>> Certainly anything that's marked an internal conf. (That does raise
>>> the question of who it's for, if you have to read source to find it.)
>>>
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>>
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>  wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>> option, but it turns out that there's a whole bunch of stuff under
>>> SQLConf.scala that has no public documentation under
>>> http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a
>>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>>> >
>>> > Another thought that comes to mind is moving the config definitions
>>> out of Scala and into a data format like YAML or JSON, and then sourcing
>>> that both for SQLConf as well as for whatever documentation page we want to
>>> generate. What do you think of that idea?
>>> >
>>> > Nick
>>> >
>>>
>>>
>>>


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Jules Damji
It’s one thing to get the names/values of the configurations via
spark.sql(“set -v”), but another thing to understand what each achieves and
when and why you’ll want to use it.

A webpage with a table and a description of each is a huge benefit.

Cheers 
Jules 

Sent from my iPhone
Pardon the dumb thumb typos :)

> On Jan 16, 2020, at 11:04 AM, Shixiong(Ryan) Zhu  
> wrote:
> 
> 
> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL 
> configurations. Should be pretty easy to automatically generate a SQL 
> configuration page.
> Best Regards,
> 
> Ryan
> 
> 
>> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon  wrote:
>> I think automatically creating a configuration page isn't a bad idea because 
>> I think we deprecate and remove configurations which are not created via 
>> .internal() in SQLConf anyway.
>> 
>> I already tried this automatic generation from the codes at SQL built-in 
>> functions and I'm pretty sure we can do the similar thing for configurations 
>> as well.
>> 
>> We could perhaps mimic what hadoop does 
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>> 
>>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>> Some of it is intentionally undocumented, as far as I know, as an
>>> experimental option that may change, or legacy, or safety valve flag.
>>> Certainly anything that's marked an internal conf. (That does raise
>>> the question of who it's for, if you have to read source to find it.)
>>> 
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>> 
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>  wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an option, 
>>> > but it turns out that there's a whole bunch of stuff under SQLConf.scala 
>>> > that has no public documentation under http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a documentation 
>>> > page from SQLConf.scala, as Hyukjin suggested on that ticket?
>>> >
>>> > Another thought that comes to mind is moving the config definitions out 
>>> > of Scala and into a data format like YAML or JSON, and then sourcing that 
>>> > both for SQLConf as well as for whatever documentation page we want to 
>>> > generate. What do you think of that idea?
>>> >
>>> > Nick
>>> >
>>> 
>>> 


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Hyukjin Kwon
Nicholas, are you interested in taking a stab at this? You could refer
https://github.com/apache/spark/commit/60472dbfd97acfd6c4420a13f9b32bc9d84219f3

On Fri, Jan 17, 2020 at 8:48 AM, Takeshi Yamamuro wrote:

> The idea looks nice. I think web documents always help end users.
>
> Bests,
> Takeshi
>
> On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
>> configurations. Should be pretty easy to automatically generate a SQL
>> configuration page.
>>
>> Best Regards,
>> Ryan
>>
>>
>> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon  wrote:
>>
>>> I think automatically creating a configuration page isn't a bad idea
>>> because I think we deprecate and remove configurations which are not
>>> created via .internal() in SQLConf anyway.
>>>
>>> I already tried this automatic generation from the codes at SQL built-in
>>> functions and I'm pretty sure we can do the similar thing for
>>> configurations as well.
>>>
>>> We could perhaps mimic what hadoop does
>>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>>
>>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>>
 Some of it is intentionally undocumented, as far as I know, as an
 experimental option that may change, or legacy, or safety valve flag.
 Certainly anything that's marked an internal conf. (That does raise
 the question of who it's for, if you have to read source to find it.)

 I don't know if we need to overhaul the conf system, but there may
 indeed be some confs that could legitimately be documented. I don't
 know which.

 On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
  wrote:
 >
 > I filed SPARK-30510 thinking that we had forgotten to document an
 option, but it turns out that there's a whole bunch of stuff under
 SQLConf.scala that has no public documentation under
 http://spark.apache.org/docs.
 >
 > Would it be appropriate to somehow automatically generate a
 documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
 >
 > Another thought that comes to mind is moving the config definitions
 out of Scala and into a data format like YAML or JSON, and then sourcing
 that both for SQLConf as well as for whatever documentation page we want to
 generate. What do you think of that idea?
 >
 > Nick
 >



>
> --
> ---
> Takeshi Yamamuro
>


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Takeshi Yamamuro
The idea looks nice. I think web documents always help end users.

Bests,
Takeshi

On Fri, Jan 17, 2020 at 4:04 AM Shixiong(Ryan) Zhu 
wrote:

> "spark.sql("set -v")" returns a Dataset that has all non-internal SQL
> configurations. Should be pretty easy to automatically generate a SQL
> configuration page.
>
> Best Regards,
> Ryan
>
>
> On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon  wrote:
>
>> I think automatically creating a configuration page isn't a bad idea
>> because I think we deprecate and remove configurations which are not
>> created via .internal() in SQLConf anyway.
>>
>> I already tried this automatic generation from the codes at SQL built-in
>> functions and I'm pretty sure we can do the similar thing for
>> configurations as well.
>>
>> We could perhaps mimic what hadoop does
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>
>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>
>>> Some of it is intentionally undocumented, as far as I know, as an
>>> experimental option that may change, or legacy, or safety valve flag.
>>> Certainly anything that's marked an internal conf. (That does raise
>>> the question of who it's for, if you have to read source to find it.)
>>>
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>>
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>  wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>> option, but it turns out that there's a whole bunch of stuff under
>>> SQLConf.scala that has no public documentation under
>>> http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a
>>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>>> >
>>> > Another thought that comes to mind is moving the config definitions
>>> out of Scala and into a data format like YAML or JSON, and then sourcing
>>> that both for SQLConf as well as for whatever documentation page we want to
>>> generate. What do you think of that idea?
>>> >
>>> > Nick
>>> >
>>>
>>>
>>>

-- 
---
Takeshi Yamamuro


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Shixiong(Ryan) Zhu
"spark.sql("set -v")" returns a Dataset that has all non-internal SQL
configurations. Should be pretty easy to automatically generate a SQL
configuration page.
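
(A minimal sketch of that idea, assuming the three string columns — key,
value, meaning — that SET -v has historically returned. This is not the script
that eventually landed in the Spark repo, just an illustration; HTML escaping
is omitted for brevity.)

    import org.apache.spark.sql.SparkSession

    object GenSqlConfigDocs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("gen-sql-config-docs")
          .getOrCreate()

        // SET -v lists only the non-internal SQL configurations.
        val rows = spark.sql("SET -v").collect()

        val sb = new StringBuilder
        sb ++= "<table>\n<tr><th>Property</th><th>Default</th><th>Meaning</th></tr>\n"
        rows.sortBy(_.getString(0)).foreach { r =>
          // Assumed column order: key, default value, doc text.
          sb ++= s"<tr><td>${r.getString(0)}</td><td>${r.getString(1)}</td><td>${r.getString(2)}</td></tr>\n"
        }
        sb ++= "</table>\n"

        java.nio.file.Files.write(
          java.nio.file.Paths.get("sql-configs.html"),
          sb.toString.getBytes("UTF-8"))
        spark.stop()
      }
    }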

Best Regards,
Ryan


On Wed, Jan 15, 2020 at 5:47 AM Hyukjin Kwon  wrote:

> I think automatically creating a configuration page isn't a bad idea
> because I think we deprecate and remove configurations which are not
> created via .internal() in SQLConf anyway.
>
> I already tried this automatic generation from the codes at SQL built-in
> functions and I'm pretty sure we can do the similar thing for
> configurations as well.
>
> We could perhaps mimic what hadoop does
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>
> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>
>> Some of it is intentionally undocumented, as far as I know, as an
>> experimental option that may change, or legacy, or safety valve flag.
>> Certainly anything that's marked an internal conf. (That does raise
>> the question of who it's for, if you have to read source to find it.)
>>
>> I don't know if we need to overhaul the conf system, but there may
>> indeed be some confs that could legitimately be documented. I don't
>> know which.
>>
>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>  wrote:
>> >
>> > I filed SPARK-30510 thinking that we had forgotten to document an
>> option, but it turns out that there's a whole bunch of stuff under
>> SQLConf.scala that has no public documentation under
>> http://spark.apache.org/docs.
>> >
>> > Would it be appropriate to somehow automatically generate a
>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>> >
>> > Another thought that comes to mind is moving the config definitions out
>> of Scala and into a data format like YAML or JSON, and then sourcing that
>> both for SQLConf as well as for whatever documentation page we want to
>> generate. What do you think of that idea?
>> >
>> > Nick
>> >
>>
>>
>>


Re: More publicly documenting the options under spark.sql.*

2020-01-16 Thread Felix Cheung
I think it’s a good idea


From: Hyukjin Kwon 
Sent: Wednesday, January 15, 2020 5:49:12 AM
To: dev 
Cc: Sean Owen ; Nicholas Chammas 
Subject: Re: More publicly documenting the options under spark.sql.*

Resending to the dev list for archive purposes:

I think automatically creating a configuration page isn't a bad idea because I 
think we deprecate and remove configurations which are not created via 
.internal() in SQLConf anyway.

I already tried this automatic generation from the codes at SQL built-in 
functions and I'm pretty sure we can do the similar thing for configurations as 
well.

We could perhaps mimic what hadoop does 
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml

On Wed, 15 Jan 2020, 22:46 Hyukjin Kwon, <gurwls...@gmail.com> wrote:
I think automatically creating a configuration page isn't a bad idea because I 
think we deprecate and remove configurations which are not created via 
.internal() in SQLConf anyway.

I already tried this automatic generation from the codes at SQL built-in 
functions and I'm pretty sure we can do the similar thing for configurations as 
well.

We could perhaps mimic what hadoop does 
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml

On Wed, 15 Jan 2020, 10:46 Sean Owen, <sro...@gmail.com> wrote:
Some of it is intentionally undocumented, as far as I know, as an
experimental option that may change, or legacy, or safety valve flag.
Certainly anything that's marked an internal conf. (That does raise
the question of who it's for, if you have to read source to find it.)

I don't know if we need to overhaul the conf system, but there may
indeed be some confs that could legitimately be documented. I don't
know which.

On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
> I filed SPARK-30510 thinking that we had forgotten to document an option, but 
> it turns out that there's a whole bunch of stuff under SQLConf.scala that has 
> no public documentation under http://spark.apache.org/docs.
>
> Would it be appropriate to somehow automatically generate a documentation 
> page from SQLConf.scala, as Hyukjin suggested on that ticket?
>
> Another thought that comes to mind is moving the config definitions out of 
> Scala and into a data format like YAML or JSON, and then sourcing that both 
> for SQLConf as well as for whatever documentation page we want to generate. 
> What do you think of that idea?
>
> Nick
>




Re: More publicly documenting the options under spark.sql.*

2020-01-15 Thread Nicholas Chammas
So do we want to repurpose
SPARK-30510 as an SQL config refactor?

Alternatively, what’s the smallest step forward I can take to publicly
document partitionOverwriteMode (which was my impetus for looking into this
in the first place)?
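
(For reference while it remains undocumented — a short, illustrative usage
sketch of that option. The key spark.sql.sources.partitionOverwriteMode and its
static/dynamic values come from SQLConf; the paths and data here are made up.)

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("partition-overwrite-mode-example")
      .getOrCreate()
    import spark.implicits._

    // Session-wide: only overwrite the partitions the write actually touches,
    // instead of wiping every existing partition first (the default, "static").
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    // Hypothetical output path, purely for illustration.
    spark.range(10)
      .withColumn("part", $"id" % 2)
      .write
      .mode("overwrite")
      .partitionBy("part")
      .parquet("/tmp/partition_overwrite_example")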

On Wed, Jan 15, 2020 at 8:49 AM, Hyukjin Kwon wrote:

> Resending to the dev list for archive purposes:
>
> I think automatically creating a configuration page isn't a bad idea
> because I think we deprecate and remove configurations which are not
> created via .internal() in SQLConf anyway.
>
> I already tried this automatic generation from the codes at SQL built-in
> functions and I'm pretty sure we can do the similar thing for
> configurations as well.
>
> We could perhaps mimic what hadoop does
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>
> On Wed, 15 Jan 2020, 22:46 Hyukjin Kwon,  wrote:
>
>> I think automatically creating a configuration page isn't a bad idea
>> because I think we deprecate and remove configurations which are not
>> created via .internal() in SQLConf anyway.
>>
>> I already tried this automatic generation from the codes at SQL built-in
>> functions and I'm pretty sure we can do the similar thing for
>> configurations as well.
>>
>> We could perhaps mimic what hadoop does
>> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>>
>> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>>
>>> Some of it is intentionally undocumented, as far as I know, as an
>>> experimental option that may change, or legacy, or safety valve flag.
>>> Certainly anything that's marked an internal conf. (That does raise
>>> the question of who it's for, if you have to read source to find it.)
>>>
>>> I don't know if we need to overhaul the conf system, but there may
>>> indeed be some confs that could legitimately be documented. I don't
>>> know which.
>>>
>>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>>  wrote:
>>> >
>>> > I filed SPARK-30510 thinking that we had forgotten to document an
>>> option, but it turns out that there's a whole bunch of stuff under
>>> SQLConf.scala that has no public documentation under
>>> http://spark.apache.org/docs.
>>> >
>>> > Would it be appropriate to somehow automatically generate a
>>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>>> >
>>> > Another thought that comes to mind is moving the config definitions
>>> out of Scala and into a data format like YAML or JSON, and then sourcing
>>> that both for SQLConf as well as for whatever documentation page we want to
>>> generate. What do you think of that idea?
>>> >
>>> > Nick
>>> >
>>>
>>>
>>>


Re: More publicly documenting the options under spark.sql.*

2020-01-15 Thread Hyukjin Kwon
Resending to the dev list for archive purposes:

I think automatically creating a configuration page isn't a bad idea
because I think we deprecate and remove configurations which are not
created via .internal() in SQLConf anyway.

I already tried this automatic generation from the codes at SQL built-in
functions and I'm pretty sure we can do the similar thing for
configurations as well.

We could perhaps mimic what hadoop does
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml

On Wed, 15 Jan 2020, 22:46 Hyukjin Kwon,  wrote:

> I think automatically creating a configuration page isn't a bad idea
> because I think we deprecate and remove configurations which are not
> created via .internal() in SQLConf anyway.
>
> I already tried this automatic generation from the codes at SQL built-in
> functions and I'm pretty sure we can do the similar thing for
> configurations as well.
>
> We could perhaps mimic what hadoop does
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml
>
> On Wed, 15 Jan 2020, 10:46 Sean Owen,  wrote:
>
>> Some of it is intentionally undocumented, as far as I know, as an
>> experimental option that may change, or legacy, or safety valve flag.
>> Certainly anything that's marked an internal conf. (That does raise
>> the question of who it's for, if you have to read source to find it.)
>>
>> I don't know if we need to overhaul the conf system, but there may
>> indeed be some confs that could legitimately be documented. I don't
>> know which.
>>
>> On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
>>  wrote:
>> >
>> > I filed SPARK-30510 thinking that we had forgotten to document an
>> option, but it turns out that there's a whole bunch of stuff under
>> SQLConf.scala that has no public documentation under
>> http://spark.apache.org/docs.
>> >
>> > Would it be appropriate to somehow automatically generate a
>> documentation page from SQLConf.scala, as Hyukjin suggested on that ticket?
>> >
>> > Another thought that comes to mind is moving the config definitions out
>> of Scala and into a data format like YAML or JSON, and then sourcing that
>> both for SQLConf as well as for whatever documentation page we want to
>> generate. What do you think of that idea?
>> >
>> > Nick
>> >
>>
>>
>>


Re: More publicly documenting the options under spark.sql.*

2020-01-14 Thread Sean Owen
Some of it is intentionally undocumented, as far as I know, as an
experimental option that may change, or legacy, or safety valve flag.
Certainly anything that's marked an internal conf. (That does raise
the question of who it's for, if you have to read source to find it.)

I don't know if we need to overhaul the conf system, but there may
indeed be some confs that could legitimately be documented. I don't
know which.

On Tue, Jan 14, 2020 at 7:32 PM Nicholas Chammas
 wrote:
>
> I filed SPARK-30510 thinking that we had forgotten to document an option, but 
> it turns out that there's a whole bunch of stuff under SQLConf.scala that has 
> no public documentation under http://spark.apache.org/docs.
>
> Would it be appropriate to somehow automatically generate a documentation 
> page from SQLConf.scala, as Hyukjin suggested on that ticket?
>
> Another thought that comes to mind is moving the config definitions out of 
> Scala and into a data format like YAML or JSON, and then sourcing that both 
> for SQLConf as well as for whatever documentation page we want to generate. 
> What do you think of that idea?
>
> Nick
>




More publicly documenting the options under spark.sql.*

2020-01-14 Thread Nicholas Chammas
I filed SPARK-30510 thinking that we had forgotten to document an option,
but it turns out that there's a whole bunch of stuff under SQLConf.scala
that has no public documentation under http://spark.apache.org/docs.

Would it be appropriate to somehow automatically generate a documentation
page from SQLConf.scala, as Hyukjin suggested on that ticket?

Another thought that comes to mind is moving the config definitions out of
Scala and into a data format like YAML or JSON, and then sourcing that both
for SQLConf as well as for whatever documentation page we want to generate.
What do you think of that idea?
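
(To make that idea concrete, a purely hypothetical sketch — none of these names
exist in Spark — of a single data-driven definition that both SQLConf
registration and a docs generator could be sourced from.)

    // Hypothetical "configs as data" sketch; in the proposal these records
    // would be parsed from YAML/JSON rather than hard-coded.
    final case class SqlConfDef(key: String, default: String, doc: String, public: Boolean)

    val defs: Seq[SqlConfDef] = Seq(
      SqlConfDef(
        key = "spark.sql.sources.partitionOverwriteMode",
        default = "static",
        doc = "Whether INSERT OVERWRITE replaces all partitions (static) or only the written ones (dynamic).",
        public = true)
    )

    // Docs-generator side: render only the public entries as a Markdown table.
    val header = Seq("| Property | Default | Meaning |", "| --- | --- | --- |")
    val rows = defs.filter(_.public).map(d => s"| ${d.key} | ${d.default} | ${d.doc} |")
    println((header ++ rows).mkString("\n"))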

Nick