[jira] [Updated] (SPARK-42118) Wrong result when parsing a multiline JSON file with differing types for same column

2023-01-19 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal updated SPARK-42118:
-
Description: 
Here is a simple reproduction of the problem. We have a JSON file whose content 
looks like the following and is in multiLine format.
{code}
[{"name":""},{"name":123.34}]
{code}

Here is the result of the Spark query when we read the above content:

scala> val df = spark.read.format("json").option("multiLine", true).load("/tmp/json")
df: org.apache.spark.sql.DataFrame = [name: double]

scala> df.show(false)
+----+
|name|
+----+
|null|
+----+

scala> df.count
res5: Long = 2

This is quite a serious problem for us, as it is causing us to persist corrupt 
data in our data lake. If there is an issue parsing the input, we expect Spark 
to set "_corrupt_record" so that we can act on it. Please note that df.count 
reports 2 rows, whereas df.show reports only 1 row, with a null value.
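
Until this is fixed, here is a minimal detection sketch (assumptions: an 
explicit schema with a "_corrupt_record" column, plus the cache-before-filter 
step Spark 2.3+ requires for corrupt-record queries; whether Spark actually 
populates "_corrupt_record" in this multiLine case is exactly what this issue 
questions):
{code:scala}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

// Explicit schema so the mismatched "name" values cannot silently widen away;
// _corrupt_record should capture rows that fail to parse.
val schema = new StructType()
  .add("name", DoubleType, true)
  .add("_corrupt_record", StringType, true)

val df = spark.read
  .format("json")
  .option("multiLine", true)
  .schema(schema)
  .load("/tmp/json")
  .cache() // required before filtering on only _corrupt_record (Spark 2.3+)

df.filter(col("_corrupt_record").isNotNull).show(false)
{code}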

  was:
Here is a simple reproduction of the problem. We have a JSON file whose content 
looks like the following and is in multiLine format.
[{"name":""},{"name":123.34}]

Here is the result of the Spark query when we read the above content:

scala> val df = spark.read.format("json").option("multiLine", true).load("/tmp/json")
df: org.apache.spark.sql.DataFrame = [name: double]

scala> df.show(false)
+----+
|name|
+----+
|null|
+----+

scala> df.count
res5: Long = 2

This is quite a serious problem for us, as it is causing us to persist corrupt 
data in our data lake. If there is an issue parsing the input, we expect Spark 
to set "_corrupt_record" so that we can act on it. Please note that df.count 
reports 2 rows, whereas df.show reports only 1 row, with a null value.


> Wrong result when parsing a multiline JSON file with differing types for same 
> column
> 
>
> Key: SPARK-42118
> URL: https://issues.apache.org/jira/browse/SPARK-42118
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.1
>Reporter: Dilip Biswal
>Priority: Major
>
> Here is a simple reproduction of the problem. We have a JSON file whose 
> content looks like the following and is in multiLine format.
> {code}
> [{"name":""},{"name":123.34}]
> {code}
> Here is the result of the Spark query when we read the above content:
> scala> val df = spark.read.format("json").option("multiLine", true).load("/tmp/json")
> df: org.apache.spark.sql.DataFrame = [name: double]
> scala> df.show(false)
> +----+
> |name|
> +----+
> |null|
> +----+
> scala> df.count
> res5: Long = 2
> This is quite a serious problem for us, as it is causing us to persist 
> corrupt data in our data lake. If there is an issue parsing the input, we 
> expect Spark to set "_corrupt_record" so that we can act on it. Please note 
> that df.count reports 2 rows, whereas df.show reports only 1 row, with a 
> null value.






[jira] [Created] (SPARK-42118) Wrong result when parsing a multiline JSON file with differing types for same column

2023-01-19 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-42118:


 Summary: Wrong result when parsing a multiline JSON file with 
differing types for same column
 Key: SPARK-42118
 URL: https://issues.apache.org/jira/browse/SPARK-42118
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.2.1
Reporter: Dilip Biswal


Here is a simple reproduction of the problem. We have a JSON file whose content 
looks like the following and is in multiLine format.
[{"name":""},{"name":123.34}]

Here is the result of the Spark query when we read the above content:

scala> val df = spark.read.format("json").option("multiLine", true).load("/tmp/json")
df: org.apache.spark.sql.DataFrame = [name: double]

scala> df.show(false)
+----+
|name|
+----+
|null|
+----+

scala> df.count
res5: Long = 2

This is quite a serious problem for us, as it is causing us to persist corrupt 
data in our data lake. If there is an issue parsing the input, we expect Spark 
to set "_corrupt_record" so that we can act on it. Please note that df.count 
reports 2 rows, whereas df.show reports only 1 row, with a null value.






[jira] [Commented] (SPARK-38402) Improve user experience when working on data frames created from CSV and JSON in PERMISSIVE mode.

2022-03-07 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502532#comment-17502532
 ] 

Dilip Biswal commented on SPARK-38402:
--

[~hyukjin.kwon] Thanks!!
Yeah, that should work. The only thing is, this puts an extra burden on the 
application to be aware of the context (i.e., accessing the error data frame) 
and to do this additional branching. We were wondering if this could be done 
implicitly by the runtime. After all, we are simply trying to do an operation 
on a data frame that was returned to us by Spark.

> Improve user experience when working on data frames created from CSV and JSON 
> in PERMISSIVE mode.
> -
>
> Key: SPARK-38402
> URL: https://issues.apache.org/jira/browse/SPARK-38402
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.2.1
>    Reporter: Dilip Biswal
>Priority: Major
>
> In our data processing pipeline, we first process the user-supplied data and 
> eliminate invalid/corrupt records. So we parse JSON and CSV files in 
> PERMISSIVE mode, where all the invalid records are captured in 
> "_corrupt_record". We then apply predicates on "_corrupt_record" to eliminate 
> the bad records before passing the good records further down the processing 
> pipeline.
> We encountered two issues.
> 1. The introduction of "predicate pushdown" for CSV does not take into 
> account this system-generated "_corrupt_record" column and tries to push 
> predicates on it down to the scan, resulting in an exception as the column 
> is not part of the base schema. 
> 2. Applying predicates on "_corrupt_record" results in an AnalysisException 
> like the following.
> {code:java}
> val schema = new StructType()
>   .add("id",IntegerType,true)
>   .add("weight",IntegerType,true) // The weight field is defined wrongly. The 
> actual data contains floating point numbers, while the schema specifies an 
> integer.
>   .add("price",IntegerType,true)
>   .add("_corrupt_record", StringType, true) // The schema contains a special 
> column _corrupt_record, which does not exist in the data. This column 
> captures rows that did not parse correctly.
> val csv_with_wrong_schema = spark.read.format("csv")
>   .option("header", "true")
>   .schema(schema)
>   .load("/FileStore/tables/csv_corrupt_record.csv")
> val badRows = csv_with_wrong_schema.filter($"_corrupt_record".isNotNull)
> val numBadRows = badRows.count()
>  Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
> referenced columns only include the internal corrupt record column
> (named _corrupt_record by default). For example:
> spark.read.schema(schema).csv(file).filter($"_corrupt_record".isNotNull).count()
> and spark.read.schema(schema).csv(file).select("_corrupt_record").show().
> Instead, you can cache or save the parsed results and then send the same 
> query.
> For example, val df = spark.read.schema(schema).csv(file).cache() and then
> df.filter($"_corrupt_record".isNotNull).count().
> {code}
> For (1), we have disabled predicate pushdown.
> For (2), we currently cache the data frame before using it; however, it's 
> not convenient, and we would like to see a better user experience.
>  






[jira] [Created] (SPARK-38402) Improve user experience when working on data frames created from CSV and JSON in PERMISSIVE mode.

2022-03-02 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-38402:


 Summary: Improve user experience when working on data frames 
created from CSV and JSON in PERMISSIVE mode.
 Key: SPARK-38402
 URL: https://issues.apache.org/jira/browse/SPARK-38402
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.2.1
Reporter: Dilip Biswal


In our data processing pipeline, we first process the user-supplied data and 
eliminate invalid/corrupt records. So we parse JSON and CSV files in PERMISSIVE 
mode, where all the invalid records are captured in "_corrupt_record". We then 
apply predicates on "_corrupt_record" to eliminate the bad records before 
passing the good records further down the processing pipeline.

We encountered two issues.
1. The introduction of "predicate pushdown" for CSV does not take into account 
this system-generated "_corrupt_record" column and tries to push predicates on 
it down to the scan, resulting in an exception as the column is not part of 
the base schema. 
2. Applying predicates on "_corrupt_record" results in an AnalysisException 
like the following.
{code:java}
val schema = new StructType()
  .add("id",IntegerType,true)
  .add("weight",IntegerType,true) // The weight field is defined wrongly. The 
actual data contains floating point numbers, while the schema specifies an 
integer.
  .add("price",IntegerType,true)
  .add("_corrupt_record", StringType, true) // The schema contains a special 
column _corrupt_record, which does not exist in the data. This column captures 
rows that did not parse correctly.

val csv_with_wrong_schema = spark.read.format("csv")
  .option("header", "true")
  .schema(schema)
  .load("/FileStore/tables/csv_corrupt_record.csv")

val badRows = csv_with_wrong_schema.filter($"_corrupt_record".isNotNull)
val numBadRows = badRows.count()
 Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
referenced columns only include the internal corrupt record column
(named _corrupt_record by default). For example:
spark.read.schema(schema).csv(file).filter($"_corrupt_record".isNotNull).count()
and spark.read.schema(schema).csv(file).select("_corrupt_record").show().
Instead, you can cache or save the parsed results and then send the same query.
For example, val df = spark.read.schema(schema).csv(file).cache() and then
df.filter($"_corrupt_record".isNotNull).count().

{code}

For (1), we have disabled predicate pushdown.
For (2), we currently cache the data frame before using it; however, it's not 
convenient, and we would like to see a better user experience.
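
For reference, a minimal sketch of the caching workaround we apply today for 
(2), following the guidance embedded in the AnalysisException above (the 
schema and path are the ones from the repro):
{code:scala}
import org.apache.spark.sql.functions.col

// Cache the parsed DataFrame first so that filters touching only the internal
// _corrupt_record column are allowed (per the Spark 2.3+ restriction).
val parsed = spark.read.format("csv")
  .option("header", "true")
  .schema(schema) // schema as defined above, including _corrupt_record
  .load("/FileStore/tables/csv_corrupt_record.csv")
  .cache()

val badRows    = parsed.filter(col("_corrupt_record").isNotNull)
val goodRows   = parsed.filter(col("_corrupt_record").isNull).drop("_corrupt_record")
val numBadRows = badRows.count()
{code}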
 






Re: Welcoming Russell Spitzer as a new committer

2021-03-29 Thread Dilip Biswal
Congratulations Russell!! Very well deserved, indeed!!

On Mon, Mar 29, 2021 at 9:13 AM Miao Wang  wrote:

> Congratulations Russell!
>
> Miao
>
> *From: *Szehon Ho 
> *Reply-To: *"dev@iceberg.apache.org" 
> *Date: *Monday, March 29, 2021 at 9:12 AM
> *To: *"dev@iceberg.apache.org" 
> *Subject: *Re: Welcoming Russell Spitzer as a new committer
>
> Awesome, well-deserved, Russell!
>
> Szehon
>
> On 29 Mar 2021, at 18:10, Holden Karau  wrote:
>
> Congratulations Russel!
>
> On Mon, Mar 29, 2021 at 9:10 AM Anton Okolnychyi <
> aokolnyc...@apple.com.invalid> wrote:
>
> Hey folks,
>
> I’d like to welcome Russell Spitzer as a new committer to the project!
>
> Thanks for all your contributions, Russell!
>
> - Anton
>


Re: Welcoming some new Apache Spark committers

2020-07-16 Thread Dilip Biswal
Thank you all for your kind words. A special "thank you" to *Xiao Li* for his 
help and mentorship over the years, which helped me immensely. I would also 
like to mention *Wenchen Fan*, *Takeshi Yamamuro*, *Sean Owen*, *Dongjoon 
Hyun*, *Hyukjin Kwon*, and *Liang-Chi Hsieh*, who all helped review the 
majority of my PRs, allowing me to grow technically.

Thanks again and looking forward to working with you all.

Regards,
Dilip

On Thu, Jul 16, 2020 at 12:53 AM Gengliang Wang <
gengliang.w...@databricks.com> wrote:

> Congratulations!
>
> On Thu, Jul 16, 2020 at 3:17 PM Dr. Kent Yao  wrote:
>
>> Congrats and welcome!!!


[jira] [Assigned] (SPARK-31480) Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-07-15 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal reassigned SPARK-31480:


Assignee: Dilip Biswal

> Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node
> ---
>
> Key: SPARK-31480
> URL: https://issues.apache.org/jira/browse/SPARK-31480
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>    Assignee: Dilip Biswal
>Priority: Major
>
> Below is the EXPLAIN output when using *DSV2*.
> *Output of EXPLAIN EXTENDED* 
> {code:java}
> +- BatchScan[col.dots#39L] JsonScan DataFilters: [isnotnull(col.dots#39L), 
> (col.dots#39L = 500)], Location: 
> InMemoryFileIndex[file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-7dad6f63-dc...,
>  PartitionFilters: [], ReadSchema: struct
> {code}
> *Output of EXPLAIN FORMATTED* 
> {code:java}
>  (1) BatchScan
> Output [1]: [col.dots#39L]
> Arguments: [col.dots#39L], 
> JsonScan(org.apache.spark.sql.test.TestSparkSession@45eab322,org.apache.spark.sql.execution.datasources.InMemoryFileIndex@72065f16,StructType(StructField(col.dots,LongType,true)),StructType(StructField(col.dots,LongType,true)),StructType(),org.apache.spark.sql.util.CaseInsensitiveStringMap@8822c5e0,Vector(),List(isnotnull(col.dots#39L),
>  (col.dots#39L = 500)))
> {code}
> When using *DSV1*, the output is much cleaner than the output of DSV2, 
> especially for EXPLAIN FORMATTED.
> *Output of EXPLAIN EXTENDED* 
> {code:java}
> +- FileScan json [col.dots#37L] Batched: false, DataFilters: 
> [isnotnull(col.dots#37L), (col.dots#37L = 500)], Format: JSON, Location: 
> InMemoryFileIndex[file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-89021d76-59...,
>  PartitionFilters: [], PushedFilters: [IsNotNull(`col.dots`), 
> EqualTo(`col.dots`,500)], ReadSchema: struct 
> {code}
> *Output of EXPLAIN FORMATTED* 
> {code:java}
>  (1) Scan json 
> Output [1]: [col.dots#37L]
> Batched: false
> Location: InMemoryFileIndex 
> [file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-89021d76-5971-4a96-bf10-0730873f6ce0]
> PushedFilters: [IsNotNull(`col.dots`), EqualTo(`col.dots`,500)]
> ReadSchema: struct{code}
>  
>  
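
For reference, a minimal sketch of how the two outputs compared above can be 
produced from the Dataset API in Spark 3.0 (the input path and filter value 
are illustrative):
{code:scala}
val df = spark.read.json("/tmp/t").filter("`col.dots` = 500")

df.explain("extended")  // corresponds to EXPLAIN EXTENDED
df.explain("formatted") // corresponds to EXPLAIN FORMATTED
{code}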






[jira] [Resolved] (SPARK-31480) Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node

2020-07-15 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal resolved SPARK-31480.
--
Resolution: Fixed

> Improve the EXPLAIN FORMATTED's output for DSV2's Scan Node
> ---
>
> Key: SPARK-31480
> URL: https://issues.apache.org/jira/browse/SPARK-31480
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>    Assignee: Dilip Biswal
>Priority: Major
>
> Below is the EXPLAIN output when using *DSV2*.
> *Output of EXPLAIN EXTENDED* 
> {code:java}
> +- BatchScan[col.dots#39L] JsonScan DataFilters: [isnotnull(col.dots#39L), 
> (col.dots#39L = 500)], Location: 
> InMemoryFileIndex[file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-7dad6f63-dc...,
>  PartitionFilters: [], ReadSchema: struct
> {code}
> *Output of EXPLAIN FORMATTED* 
> {code:java}
>  (1) BatchScan
> Output [1]: [col.dots#39L]
> Arguments: [col.dots#39L], 
> JsonScan(org.apache.spark.sql.test.TestSparkSession@45eab322,org.apache.spark.sql.execution.datasources.InMemoryFileIndex@72065f16,StructType(StructField(col.dots,LongType,true)),StructType(StructField(col.dots,LongType,true)),StructType(),org.apache.spark.sql.util.CaseInsensitiveStringMap@8822c5e0,Vector(),List(isnotnull(col.dots#39L),
>  (col.dots#39L = 500)))
> {code}
> When using *DSV1*, the output is much cleaner than the output of DSV2, 
> especially for EXPLAIN FORMATTED.
> *Output of EXPLAIN EXTENDED* 
> {code:java}
> +- FileScan json [col.dots#37L] Batched: false, DataFilters: 
> [isnotnull(col.dots#37L), (col.dots#37L = 500)], Format: JSON, Location: 
> InMemoryFileIndex[file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-89021d76-59...,
>  PartitionFilters: [], PushedFilters: [IsNotNull(`col.dots`), 
> EqualTo(`col.dots`,500)], ReadSchema: struct 
> {code}
> *Output of EXPLAIN FORMATTED* 
> {code:java}
>  (1) Scan json 
> Output [1]: [col.dots#37L]
> Batched: false
> Location: InMemoryFileIndex 
> [file:/private/var/folders/nr/j6hw4kr51wv0zynvr6srwgr0gp/T/spark-89021d76-5971-4a96-bf10-0730873f6ce0]
> PushedFilters: [IsNotNull(`col.dots`), EqualTo(`col.dots`,500)]
> ReadSchema: struct{code}
>  
>  






[jira] [Created] (SPARK-32020) Refactor the logic to compute SPARK_HOME into a common place.

2020-06-17 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-32020:


 Summary: Refactor the logic to compute SPARK_HOME into a common 
place.
 Key: SPARK-32020
 URL: https://issues.apache.org/jira/browse/SPARK-32020
 Project: Spark
  Issue Type: Test
  Components: SQL
Affects Versions: 3.0.1
Reporter: Dilip Biswal


Currently, several unit tests inline the logic to compute SPARK_HOME. In 
addition, the error thrown when SPARK_HOME is not set prints the entire 
environment, which makes it difficult to see the actual root cause: 
"SPARK_HOME is not set".

1. Refactor the code that computes SPARK_HOME into a common place 
(SQLHelper.scala).
2. Change the error message to make it more readable.
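
A minimal sketch of the idea (the helper name, lookup order, and message are 
illustrative, not the final API):
{code:scala}
// One shared place to resolve SPARK_HOME, with a readable error instead of a
// dump of the entire environment.
object SQLHelper {
  def getSparkHome(): String =
    sys.props.get("spark.test.home")
      .orElse(sys.env.get("SPARK_HOME"))
      .getOrElse(throw new IllegalStateException(
        "SPARK_HOME is not set: set the SPARK_HOME environment variable " +
          "or the spark.test.home system property."))
}
{code}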






Re: [vote] Apache Spark 3.0 RC3

2020-06-08 Thread Dilip Biswal
+1 (non-binding)

Regards,
-- Dilip

On Mon, Jun 8, 2020 at 1:03 PM Dongjoon Hyun 
wrote:

> +1
>
> Thanks,
> Dongjoon.
>
> On Mon, Jun 8, 2020 at 6:37 AM Russell Spitzer 
> wrote:
>
>> +1 (non-binding) ran the new SCC DSV2 suite and all other tests, no issues
>>
>> On Sun, Jun 7, 2020 at 11:12 PM Yin Huai  wrote:
>>
>>> Hello everyone,
>>>
>>> I am wondering if it makes more sense to not count Saturday and Sunday.
>>> I doubt that any serious testing work was done during this past weekend.
>>> Can we only count business days in the voting process?
>>>
>>> Thanks,
>>>
>>> Yin
>>>
>>> On Sun, Jun 7, 2020 at 3:24 PM Denny Lee  wrote:
>>>
 +1 (non-binding)

 On Sun, Jun 7, 2020 at 3:21 PM Jungtaek Lim <
 kabhwan.opensou...@gmail.com> wrote:

> I'm seeing the effort of including the correctness issue SPARK-28067
> [1] in 3.0.0 via SPARK-31894 [2]. That doesn't seem to be a regression, so
> technically it doesn't block the release. While it'd be good to weigh its
> worth (it requires some SS users to discard their state, so requiring it in
> a major version upgrade might be less frightening), it looks to be optional
> to include SPARK-28067 in 3.0.0.
>
> Besides, I see all blockers look to be resolved, thanks all for the
> amazing efforts!
>
> +1 (non-binding) if the decision of SPARK-28067 is "later".
>
> 1. https://issues.apache.org/jira/browse/SPARK-28067
> 2. https://issues.apache.org/jira/browse/SPARK-31894
>
> On Mon, Jun 8, 2020 at 5:23 AM Matei Zaharia 
> wrote:
>
>> +1
>>
>> Matei
>>
>> On Jun 7, 2020, at 6:53 AM, Maxim Gekk 
>> wrote:
>>
>> +1 (non-binding)
>>
>> On Sun, Jun 7, 2020 at 2:34 PM Takeshi Yamamuro <
>> linguin@gmail.com> wrote:
>>
>>> +1 (non-binding)
>>>
>>> I don't see any ongoing PR to fix critical bugs in my area.
>>> Bests,
>>> Takeshi
>>>
>>> On Sun, Jun 7, 2020 at 7:24 PM Mridul Muralidharan 
>>> wrote:
>>>
 +1

 Regards,
 Mridul

 On Sat, Jun 6, 2020 at 1:20 PM Reynold Xin 
 wrote:

> Apologies for the mistake. The vote is open till 11:59pm Pacific
> time on Mon June 9th.
>
> On Sat, Jun 6, 2020 at 1:08 PM Reynold Xin 
> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark
>> version 3.0.0.
>>
>> The vote is open until [DUE DAY] and passes if a majority +1 PMC
>> votes are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.0.0
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see
>> http://spark.apache.org/
>>
>> The tag to be voted on is v3.0.0-rc3 (commit
>> 3fdfce3120f307147244e5eaf46d61419a723d50):
>> https://github.com/apache/spark/tree/v3.0.0-rc3
>>
>> The release files, including signatures, digests, etc. can be
>> found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc3-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>>
>> https://repository.apache.org/content/repositories/orgapachespark-1350/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.0.0-rc3-docs/
>>
>> The list of bug fixes going into 3.0.0 can be found at the
>> following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12339177
>>
>> This release is using the release script of the tag v3.0.0-rc3.
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running it on this release candidate,
>> then reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC and see if anything important breaks; in the Java/Scala
>> APIs you can add the staging repository to your project's resolvers and
>> test with the RC (make sure to clean up the artifact cache before/after
>> so you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.0.0?
>> ===

[jira] [Updated] (SPARK-31875) Provide an option to disable user supplied Hints.

2020-05-31 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal updated SPARK-31875:
-
Summary: Provide an option to disable user supplied Hints.  (was: Provide a 
option to disabling user supplied Hints.)

> Provide an option to disable user supplied Hints.
> 
>
> Key: SPARK-31875
> URL: https://issues.apache.org/jira/browse/SPARK-31875
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Dilip Biswal
>Priority: Major
>
> Provide a config option similar to Oracle's OPTIMIZER_IGNORE_HINTS. This can 
> be helpful for studying the performance difference between running queries 
> with hints applied and without.






[jira] [Created] (SPARK-31875) Provide an option to disable user supplied Hints.

2020-05-31 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-31875:


 Summary: Provide an option to disable user supplied Hints.
 Key: SPARK-31875
 URL: https://issues.apache.org/jira/browse/SPARK-31875
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.0
Reporter: Dilip Biswal


Provide a config option similar to Oracle's OPTIMIZER_IGNORE_HINTS. This can be 
helpful for studying the performance difference between running queries with 
hints applied and without.
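
A sketch of the intended usage (the config key name is hypothetical; the issue 
only proposes an OPTIMIZER_IGNORE_HINTS-style option):
{code:scala}
// Compare plans with hints honored vs. ignored, without editing the query.
spark.conf.set("spark.sql.optimizer.disableHints", "true")
spark.sql("SELECT /*+ BROADCAST(t2) */ * FROM t1 JOIN t2 ON t1.id = t2.id")
  .explain() // the hint is ignored; the planner picks the join strategy itself

spark.conf.set("spark.sql.optimizer.disableHints", "false") // back to normal
{code}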






[jira] [Created] (SPARK-31673) Enhance QueryExecution.debugFile to take an additional explain mode parameter.

2020-05-10 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-31673:


 Summary: Enhance QueryExecution.debugFile to take an additional 
explain mode parameter.
 Key: SPARK-31673
 URL: https://issues.apache.org/jira/browse/SPARK-31673
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.0
Reporter: Dilip Biswal


Currently, debugFile dumps the debugging information for a query in a fixed 
format. We can pass in an explain mode parameter to write the debugging 
information in the requested format.
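
A sketch of the proposed call shape (the debug.toFile entry point exists 
today; the explain mode parameter and its values are the proposal and are 
illustrative):
{code:scala}
val df = spark.sql("SELECT * FROM t1")

// Today: the debug output is written in one fixed format.
df.queryExecution.debug.toFile("/tmp/debug.txt")

// Proposed: let the caller choose the explain format, e.g.
// df.queryExecution.debug.toFile("/tmp/debug.txt", explainMode = "formatted")
{code}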






[jira] [Created] (SPARK-30589) Document DISTRIBUTE BY Clause of SELECT statement in SQL Reference.

2020-01-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30589:


 Summary: Document DISTRIBUTE BY Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30589
 URL: https://issues.apache.org/jira/browse/SPARK-30589
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30588) Document CLUSTER BY Clause of SELECT statement in SQL Reference.

2020-01-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30588:


 Summary: Document CLUSTER BY Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30588
 URL: https://issues.apache.org/jira/browse/SPARK-30588
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30583) Document LIMIT Clause of SELECT statement in SQL Reference.

2020-01-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30583:


 Summary:  Document LIMIT Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30583
 URL: https://issues.apache.org/jira/browse/SPARK-30583
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30581) Document SORT BY Clause of SELECT statement in SQL Reference.

2020-01-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30581:


 Summary: Document SORT BY Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30581
 URL: https://issues.apache.org/jira/browse/SPARK-30581
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30579) Document ORDER BY Clause of SELECT statement in SQL Reference.

2020-01-19 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30579:


 Summary: Document ORDER BY Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30579
 URL: https://issues.apache.org/jira/browse/SPARK-30579
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30575) Document HAVING Clause of SELECT statement in SQL Reference.

2020-01-19 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30575:


 Summary: Document HAVING Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30575
 URL: https://issues.apache.org/jira/browse/SPARK-30575
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30574) Document GROUP BY Clause of SELECT statement in SQL Reference.

2020-01-19 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30574:


 Summary: Document GROUP BY Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30574
 URL: https://issues.apache.org/jira/browse/SPARK-30574
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-30573) Document WHERE Clause of SELECT statement in SQL Reference.

2020-01-19 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-30573:


 Summary: Document WHERE Clause of SELECT statement in SQL 
Reference.
 Key: SPARK-30573
 URL: https://issues.apache.org/jira/browse/SPARK-30573
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Created] (SPARK-29806) Using multiline option for a JSON file which is not multiline results in silent truncation of data.

2019-11-08 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29806:


 Summary: Using multiline option for a JSON file which is not 
multiline results in silent truncation of data.
 Key: SPARK-29806
 URL: https://issues.apache.org/jira/browse/SPARK-29806
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.4
Reporter: Dilip Biswal


The content of the input JSON file:
{code:java}
{"name":"John", "id":"100"}
{"name":"Marry","id":"200"}{code}
The above is a valid JSON file, but every record is on a single line. Trying 
to read this file with the multiLine option in FAILFAST mode results in data 
truncation without any error.
{code:java}
scala> spark.read.option("multiLine", true).option("mode", 
"FAILFAST").format("json").load("/tmp/json").show(false)
+---++
|id |name|
+---++
|100|John|
+---++

scala> spark.read.option("mode", 
"FAILFAST").format("json").load("/tmp/json").show(false)
+---+-+
|id |name |
+---+-+
|100|John |
|200|Marry|
+---+-+{code}

I think Spark should return an error in this case, especially in FAILFAST 
mode. This can be a common user error, and we should not silently truncate data.
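
Until then, a defensive check callers can apply (a sketch of a guard, not a fix):
{code:scala}
// Read the file with and without multiLine and compare row counts; a mismatch
// suggests the option does not match the file's actual layout.
val multi  = spark.read.option("multiLine", true).format("json").load("/tmp/json")
val single = spark.read.format("json").load("/tmp/json")

if (multi.count() != single.count()) {
  println(s"multiLine=${multi.count()} vs default=${single.count()}: " +
    "the input may not really be a multi-line JSON document")
}
{code}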







[jira] [Created] (SPARK-29563) CREATE TABLE LIKE should look up catalog/table like v2 commands

2019-10-22 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29563:


 Summary: CREATE TABLE LIKE should look up catalog/table like v2 
commands
 Key: SPARK-29563
 URL: https://issues.apache.org/jira/browse/SPARK-29563
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.0
Reporter: Dilip Biswal
Assignee: Wenchen Fan









[jira] [Created] (SPARK-29458) Document scalar functions usage in APIs in SQL getting started.

2019-10-13 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29458:


 Summary: Document scalar functions usage in APIs in SQL getting 
started.
 Key: SPARK-29458
 URL: https://issues.apache.org/jira/browse/SPARK-29458
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal









[jira] [Commented] (SPARK-29458) Document scalar functions usage in APIs in SQL getting started.

2019-10-13 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950727#comment-16950727
 ] 

Dilip Biswal commented on SPARK-29458:
--

I will take a look at this one.

> Document scalar functions usage in APIs in SQL getting started.
> ---
>
> Key: SPARK-29458
> URL: https://issues.apache.org/jira/browse/SPARK-29458
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Created] (SPARK-29366) Subqueries created for DPP are not printed in EXPLAIN FORMATTED

2019-10-06 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29366:


 Summary: Subqueries created for DPP are not printed in EXPLAIN 
FORMATTED
 Key: SPARK-29366
 URL: https://issues.apache.org/jira/browse/SPARK-29366
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.4
Reporter: Dilip Biswal


The subquery expressions introduced by DPP are not printed in the newer 
EXPLAIN FORMATTED output.
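
A minimal sketch of a query shape that exercises the gap (table and column 
names are illustrative; the config constant is the one used in the SPARK-29092 
repro later in this thread):
{code:scala}
import org.apache.spark.sql.internal.SQLConf

// With DPP enabled, the plan contains dynamicpruningexpression(... IN subquery)
// on the partitioned side, but EXPLAIN FORMATTED does not print the
// corresponding subquery section.
spark.conf.set(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key, "true")

spark.sql(
  "EXPLAIN FORMATTED SELECT f.id, d.k FROM fact f JOIN dim d " +
    "ON f.k = d.k AND d.id < 2").show(false)
{code}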






[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-30 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941173#comment-16941173
 ] 

Dilip Biswal commented on SPARK-29211:
--

Thanks [~dongjoon]

> Second invocation of custom UDF results in exception (when invoked from shell)
> --
>
> Key: SPARK-29211
> URL: https://issues.apache.org/jira/browse/SPARK-29211
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.4, 2.4.4, 3.0.0
>Reporter: Dilip Biswal
>Priority: Major
>
> I encountered this while writing documentation for SQL reference. Here is the 
> small repro:
> UDF:
>  =
> {code:java}
> import org.apache.hadoop.hive.ql.exec.UDF;
>   
> public class SimpleUdf extends UDF {
>   public int evaluate(int value) {
> return value + 10;
>   }
> }
> {code}
> {code:java}
> spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
> '/tmp/SimpleUdf.jar'").show
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> +---------------------+
> |function_return_value|
> +---------------------+
> |                   11|
> |                   12|
> +---------------------+
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM 
> t1").show
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
> hive.internal.ss.authz.settings.applied.marker does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout 
> does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait 
> does not exist
> org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
> 'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
>   at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:57)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:61)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:60)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:78)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:78)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$makeFunctionExpression$2(HiveSessionCatalog.scala:78)
>   at scala.util.Failure.getOrElse(Try.scala:222)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:70)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.$anonfun$makeFunctionBuilder$1(SessionCatalog.scala:1176)
>   at 
> org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1344)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.super$lookupFunction(HiveSessionCatalog.scala:132)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$lookupFunction0$2(HiveSessionCatalog.scala:132)
> {code}
> Please note that the problem does not happen if we try it from a test suite. 
> So far I have only seen it when I try it from the shell. I also tried it in 
> 2.4.4 and observed the same behaviour.






[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-24 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936529#comment-16936529
 ] 

Dilip Biswal commented on SPARK-29211:
--

[~dongjoon] I tried with 2.3.4 and saw the same issue. The Spark download site 
only offers two versions, so I am unable to go back below 2.3.4.

> Second invocation of custom UDF results in exception (when invoked from shell)
> --
>
> Key: SPARK-29211
> URL: https://issues.apache.org/jira/browse/SPARK-29211
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4, 3.0.0
>Reporter: Dilip Biswal
>Priority: Major
>
> I encountered this while writing documentation for SQL reference. Here is the 
> small repro:
> UDF:
>  =
> {code:java}
> import org.apache.hadoop.hive.ql.exec.UDF;
>   
> public class SimpleUdf extends UDF {
>   public int evaluate(int value) {
> return value + 10;
>   }
> }
> {code}
> {code:java}
> spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
> '/tmp/SimpleUdf.jar'").show
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> +---------------------+
> |function_return_value|
> +---------------------+
> |                   11|
> |                   12|
> +---------------------+
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM 
> t1").show
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
> hive.internal.ss.authz.settings.applied.marker does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout 
> does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait 
> does not exist
> org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
> 'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
>   at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:57)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:61)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:60)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:78)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:78)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$makeFunctionExpression$2(HiveSessionCatalog.scala:78)
>   at scala.util.Failure.getOrElse(Try.scala:222)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:70)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.$anonfun$makeFunctionBuilder$1(SessionCatalog.scala:1176)
>   at 
> org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1344)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.super$lookupFunction(HiveSessionCatalog.scala:132)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$lookupFunction0$2(HiveSessionCatalog.scala:132)
> {code}
> Please note that the problem does not happen if we try it from a test suite. 
> So far I have only seen it when I try it from the shell. I also tried it in 
> 2.4.4 and observed the same behaviour.






[jira] [Commented] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935608#comment-16935608
 ] 

Dilip Biswal commented on SPARK-29211:
--

cc [~smilegator] [~dongjoon]

> Second invocation of custom UDF results in exception (when invoked from shell)
> --
>
> Key: SPARK-29211
> URL: https://issues.apache.org/jira/browse/SPARK-29211
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: Dilip Biswal
>Priority: Major
>
> I encountered this while writing documentation for SQL reference. Here is the 
> small repro:
> UDF:
>  =
> {code:java}
> import org.apache.hadoop.hive.ql.exec.UDF;
>   
> public class SimpleUdf extends UDF {
>   public int evaluate(int value) {
> return value + 10;
>   }
> }
> {code}
> {code:java}
> spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
> '/tmp/SimpleUdf.jar'").show
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> +---------------------+
> |function_return_value|
> +---------------------+
> |                   11|
> |                   12|
> +---------------------+
> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
> scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM 
> t1").show
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
> hive.internal.ss.authz.settings.applied.marker does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout 
> does not exist
> 19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait 
> does not exist
> org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
> 'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
>   at 
> scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:57)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:61)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:60)
>   at 
> org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:78)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:78)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$makeFunctionExpression$2(HiveSessionCatalog.scala:78)
>   at scala.util.Failure.getOrElse(Try.scala:222)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:70)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.$anonfun$makeFunctionBuilder$1(SessionCatalog.scala:1176)
>   at 
> org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1344)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.super$lookupFunction(HiveSessionCatalog.scala:132)
>   at 
> org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$lookupFunction0$2(HiveSessionCatalog.scala:132)
> {code}
> Please note that the problem does not happen if we try it from a test suite. 
> So far I have only seen it when I try it from the shell. I also tried it in 
> 2.4.4 and observed the same behaviour.






[jira] [Updated] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal updated SPARK-29211:
-
Description: 
I encountered this while writing documentation for SQL reference. Here is the 
small repro:

UDF:
 =
{code:java}
import org.apache.hadoop.hive.ql.exec.UDF;
  
public class SimpleUdf extends UDF {
  public int evaluate(int value) {
return value + 10;
  }
}
{code}
{code:java}
spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
'/tmp/SimpleUdf.jar'").show
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
+---------------------+
|function_return_value|
+---------------------+
|                   11|
|                   12|
+---------------------+
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
hive.internal.ss.authz.settings.applied.marker does not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does 
not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait does 
not exist
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
  at 
scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at 
org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
  at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:57)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:61)
  at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:60)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:78)
  at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:78)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$makeFunctionExpression$2(HiveSessionCatalog.scala:78)
  at scala.util.Failure.getOrElse(Try.scala:222)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:70)
  at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.$anonfun$makeFunctionBuilder$1(SessionCatalog.scala:1176)
  at 
org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
  at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1344)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.super$lookupFunction(HiveSessionCatalog.scala:132)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$lookupFunction0$2(HiveSessionCatalog.scala:132)
{code}
Please note that the problem does not happen if we try it from a test suite. 
So far I have only seen it when I try it from the shell. I also tried it in 
2.4.4 and observed the same behaviour.

  was:
I encountered this while writing documentation for SQL reference. Here is the 
small repro:

UDF:
=
{code}
import org.apache.hadoop.hive.ql.exec.UDF;
  
public class SimpleUdf extends UDF {
  public int evaluate(int value) {
return value + 10;
  }
}
{code}

{code}
spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
'/tmp/SimpleUdf.jar'").show
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
+---------------------+
|function_return_value|
+---------------------+
|                   11|
|                   12|
+---------------------+
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
hive.internal.ss.authz.settings.applied.marker does not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does 
not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait does 
not exist
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
  at 
scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at 
org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
  at org.apache.spark.sql.hive.HiveSimp

[jira] [Created] (SPARK-29211) Second invocation of custom UDF results in exception (when invoked from shell)

2019-09-23 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29211:


 Summary: Second invocation of custom UDF results in exception 
(when invoked from shell)
 Key: SPARK-29211
 URL: https://issues.apache.org/jira/browse/SPARK-29211
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.4
Reporter: Dilip Biswal


I encountered this while writing documentation for SQL reference. Here is the 
small repro:

UDF:
=
{code}
import org.apache.hadoop.hive.ql.exec.UDF;
  
public class SimpleUdf extends UDF {
  public int evaluate(int value) {
return value + 10;
  }
}
{code}

{code}
spark.sql("CREATE FUNCTION simple_udf AS 'SimpleUdf' USING JAR 
'/tmp/SimpleUdf.jar'").show
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
+---------------------+
|function_return_value|
+---------------------+
|                   11|
|                   12|
+---------------------+
spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
scala> spark.sql("SELECT simple_udf(c1) AS function_return_value FROM t1").show
19/09/23 00:43:18 WARN HiveConf: HiveConf of name 
hive.internal.ss.authz.settings.applied.marker does not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does 
not exist
19/09/23 00:43:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait does 
not exist
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 
'SimpleUdf': java.lang.ClassNotFoundException: SimpleUdf; line 1 pos 7
  at 
scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at 
org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:245)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:57)
  at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:57)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:61)
  at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:60)
  at 
org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:78)
  at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:78)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$makeFunctionExpression$2(HiveSessionCatalog.scala:78)
  at scala.util.Failure.getOrElse(Try.scala:222)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:70)
  at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.$anonfun$makeFunctionBuilder$1(SessionCatalog.scala:1176)
  at 
org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
  at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1344)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.super$lookupFunction(HiveSessionCatalog.scala:132)
  at 
org.apache.spark.sql.hive.HiveSessionCatalog.$anonfun$lookupFunction0$2(HiveSessionCatalog.scala:132)
{code}

Please note that the problem does not happen if we try it from a test suite. 
So far I have only seen it when I try it from the shell.








[jira] [Comment Edited] (SPARK-28793) Document CREATE FUNCTION in SQL Reference.

2019-09-18 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932140#comment-16932140
 ] 

Dilip Biswal edited comment on SPARK-28793 at 9/18/19 7:01 AM:
---

[~sandeep.katta2007] Hey... actually I have mostly written this up. Let me 
check tomorrow what state it is in. If there is too much left, I will let 
you know.


was (Author: dkbiswal):
[~sandeep.katta2007] hey... actually i have mostly written up this. Let me 
check tomorrow what state it is in. If there is too much that is left i will 
let you know. 

> Document CREATE FUNCTION in SQL Reference.
> --
>
> Key: SPARK-28793
> URL: https://issues.apache.org/jira/browse/SPARK-28793
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>







[jira] [Commented] (SPARK-28793) Document CREATE FUNCTION in SQL Reference.

2019-09-18 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932140#comment-16932140
 ] 

Dilip Biswal commented on SPARK-28793:
--

[~sandeep.katta2007] Hey... actually I have mostly written up this. Let me 
check tomorrow what state it is in. If there is too much left, I will let 
you know.

> Document CREATE FUNCTION in SQL Reference.
> --
>
> Key: SPARK-28793
> URL: https://issues.apache.org/jira/browse/SPARK-28793
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29092) EXPLAIN FORMATTED does not work well with DPP

2019-09-15 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930237#comment-16930237
 ] 

Dilip Biswal commented on SPARK-29092:
--

I am looking into this.

> EXPLAIN FORMATTED does not work well with DPP
> -
>
> Key: SPARK-29092
> URL: https://issues.apache.org/jira/browse/SPARK-29092
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
>  
> {code:java}
> withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_ENABLED.key -> "true",
>   SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST.key -> "false") {
>   withTable("df1", "df2") {
> spark.range(1000)
>   .select(col("id"), col("id").as("k"))
>   .write
>   .partitionBy("k")
>   .format(tableFormat)
>   .mode("overwrite")
>   .saveAsTable("df1")
> spark.range(100)
>   .select(col("id"), col("id").as("k"))
>   .write
>   .partitionBy("k")
>   .format(tableFormat)
>   .mode("overwrite")
>   .saveAsTable("df2")
> sql("EXPLAIN FORMATTED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
> df2.k AND df2.id < 2")
>   .show(false)
> sql("EXPLAIN EXTENDED SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = 
> df2.k AND df2.id < 2")
>   .show(false)
>   }
> }
> {code}
> The output of EXPLAIN EXTENDED is expected.
> {code:java}
> == Physical Plan ==
> *(2) Project [id#2721L, k#2724L]
> +- *(2) BroadcastHashJoin [k#2722L], [k#2724L], Inner, BuildRight
>:- *(2) ColumnarToRow
>:  +- FileScan parquet default.df1[id#2721L,k#2722L] Batched: true, 
> DataFilters: [], Format: Parquet, Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2722L), dynamicpruningexpression(k#2722L IN 
> subquery2741)], PushedFilters: [], ReadSchema: struct
>:+- Subquery subquery2741, [id=#358]
>:   +- *(2) HashAggregate(keys=[k#2724L], functions=[], 
> output=[k#2724L#2740L])
>:  +- Exchange hashpartitioning(k#2724L, 5), true, [id=#354]
>: +- *(1) HashAggregate(keys=[k#2724L], functions=[], 
> output=[k#2724L])
>:+- *(1) Project [k#2724L]
>:   +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L 
> < 2))
>:  +- *(1) ColumnarToRow
>: +- FileScan parquet 
> default.df2[id#2723L,k#2724L] Batched: true, DataFilters: 
> [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
> LessThan(id,2)], ReadSchema: struct
>+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, 
> true])), [id=#379]
>   +- *(1) Project [k#2724L]
>  +- *(1) Filter (isnotnull(id#2723L) AND (id#2723L < 2))
> +- *(1) ColumnarToRow
>+- FileScan parquet default.df2[id#2723L,k#2724L] Batched: 
> true, DataFilters: [isnotnull(id#2723L), (id#2723L < 2)], Format: Parquet, 
> Location: 
> PrunedInMemoryFileIndex[file:/Users/lixiao/IdeaProjects/spark/sql/core/spark-warehouse/org.apache...,
>  PartitionFilters: [isnotnull(k#2724L)], PushedFilters: [IsNotNull(id), 
> LessThan(id,2)], ReadSchema: struct
> {code}
> However, the output of FileScan node of EXPLAIN FORMATTED does not show the 
> effect of DPP
> {code:java}
> * Project (9)
> +- * BroadcastHashJoin Inner BuildRight (8)
>:- * ColumnarToRow (2)
>:  +- Scan parquet default.df1 (1)
>+- BroadcastExchange (7)
>   +- * Project (6)
>  +- * Filter (5)
> +- * ColumnarToRow (4)
>+- Scan parquet default.df2 (3)
> (1) Scan parquet default.df1 
> Output: [id#2716L, k#2717L]
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28588) Build a SQL reference doc

2019-09-12 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal updated SPARK-28588:
-
Description: 
Spark SQL requires a SQL reference doc for end users, like all the other query 
engines. [https://www.postgresql.org/docs/11/sql.html] is an example. 

Please follow the format of PRs that were merged. Below are a couple of PRs
that can be used as a guideline.

[PR1|https://github.com/apache/spark/pull/25529]
[PR2|https://github.com/apache/spark/pull/25525]

 

  was:Spark SQL requires a SQL reference doc for end users, like all the other 
query engines. [https://www.postgresql.org/docs/11/sql.html] is an example. 


> Build a SQL reference doc
> -
>
> Key: SPARK-28588
> URL: https://issues.apache.org/jira/browse/SPARK-28588
> Project: Spark
>  Issue Type: Umbrella
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> Spark SQL requires a SQL reference doc for end users, like all the other 
> query engines. [https://www.postgresql.org/docs/11/sql.html] is an example. 
> Please follow the format of PRs that were merged. Below are a couple of PRs
> that can be used as a guideline.
> [PR1|https://github.com/apache/spark/pull/25529]
> [PR2|https://github.com/apache/spark/pull/25525]
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927270#comment-16927270
 ] 

Dilip Biswal commented on SPARK-29038:
--

[~jerryshao] [~smilegator] Thanks. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> Materialized view is an important approach in DBMS to cache data to 
> accelerate queries. By creating a materialized view through SQL, the data 
> that can be cached is very flexible, and needs to be configured arbitrarily 
> according to specific usage scenarios. The Materialization Manager 
> automatically updates the cache data according to changes in detail source 
> tables, simplifying user work. When user submit query, Spark optimizer 
> rewrites the execution plan based on the available materialized view to 
> determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927258#comment-16927258
 ] 

Dilip Biswal edited comment on SPARK-29038 at 9/11/19 5:13 AM:
---

[~cltlfcjin]

Actually I had a similar question as [~mgaido]. We have been writing the SQL
reference for 3.0 and have recently documented
{code:java}
 CACHE TABLE {code}
in [https://github.com/apache/spark/pull/25532]. So in Spark, it is
possible to cache the result of a complex query involving joins, aggregates,
etc., right?
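
For illustration, a minimal sketch of what I mean (the table and column names
below are made up):
{code:java}
scala> // Caches the materialized result of a join + aggregate under a name
scala> spark.sql("""
     |   CACHE TABLE sales_by_store AS
     |   SELECT s.store_id, count(*) AS cnt, sum(s.amount) AS total
     |   FROM sales s JOIN stores t ON s.store_id = t.id
     |   GROUP BY s.store_id
     | """)
{code}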


was (Author: dkbiswal):
[~cltlfcjin]

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 and have recently documented
{code:java}
 CACHE TABLE {code}
in [https://github.com/apache/spark/pull/25532].  So in SPARK, it is
 possible to cache the result of a complex query involving joins, aggregates 
etc. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> Materialized view is an important approach in DBMS to cache data to 
> accelerate queries. By creating a materialized view through SQL, the data 
> that can be cached is very flexible, and needs to be configured arbitrarily 
> according to specific usage scenarios. The Materialization Manager 
> automatically updates the cache data according to changes in detail source 
> tables, simplifying user work. When user submit query, Spark optimizer 
> rewrites the execution plan based on the available materialized view to 
> determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927258#comment-16927258
 ] 

Dilip Biswal edited comment on SPARK-29038 at 9/11/19 5:09 AM:
---

[~cltlfcjin]

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 and have recently documented
{code:java}
 CACHE TABLE {code}
in [https://github.com/apache/spark/pull/25532].  So in SPARK, it is
 possible to cache the result of a complex query involving joins, aggregates 
etc. 


was (Author: dkbiswal):
[~cltlfcjin] 

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 and have recently documented {code} CACHE TABLE {code}  in 
[https://github.com/apache/spark/pull/25532].  So in SPARK, it is
possible to cache the result of a complex query involving joins, aggregates 
etc. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> Materialized view is an important approach in DBMS to cache data to 
> accelerate queries. By creating a materialized view through SQL, the data 
> that can be cached is very flexible, and needs to be configured arbitrarily 
> according to specific usage scenarios. The Materialization Manager 
> automatically updates the cache data according to changes in detail source 
> tables, simplifying user work. When user submit query, Spark optimizer 
> rewrites the execution plan based on the available materialized view to 
> determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927258#comment-16927258
 ] 

Dilip Biswal edited comment on SPARK-29038 at 9/11/19 5:09 AM:
---

[~cltlfcjin] 

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 and have recently documented {code} CACHE TABLE {code}  in 
[https://github.com/apache/spark/pull/25532].  So in SPARK, it is
possible to cache the result of a complex query involving joins, aggregates 
etc. 


was (Author: dkbiswal):
[~cltlfcjin] 

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 have recently

documented {code} CACHE TABLE {code}  in 
[https://github.com/apache/spark/pull/25532].  So in SPARK, it is
possible to cache the result of a complex query involving joins, aggregates 
etc. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> Materialized view is an important approach in DBMS to cache data to 
> accelerate queries. By creating a materialized view through SQL, the data 
> that can be cached is very flexible, and needs to be configured arbitrarily 
> according to specific usage scenarios. The Materialization Manager 
> automatically updates the cache data according to changes in detail source 
> tables, simplifying user work. When user submit query, Spark optimizer 
> rewrites the execution plan based on the available materialized view to 
> determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927258#comment-16927258
 ] 

Dilip Biswal commented on SPARK-29038:
--

[~cltlfcjin] 

Actually i had similar question as [~mgaido]. We have been writing the SQL 
reference for 3.0 have recently

documented {code} CACHE TABLE {code}  in 
[https://github.com/apache/spark/pull/25532].  So in SPARK, it is
possible to cache the result of a complex query involving joins, aggregates 
etc. 

> SPIP: Support Spark Materialized View
> -
>
> Key: SPARK-29038
> URL: https://issues.apache.org/jira/browse/SPARK-29038
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Lantao Jin
>Priority: Major
>
> Materialized view is an important approach in DBMS to cache data to 
> accelerate queries. By creating a materialized view through SQL, the data 
> that can be cached is very flexible, and needs to be configured arbitrarily 
> according to specific usage scenarios. The Materialization Manager 
> automatically updates the cache data according to changes in detail source 
> tables, simplifying user work. When user submit query, Spark optimizer 
> rewrites the execution plan based on the available materialized view to 
> determine the optimal execution plan.
> Details in [design 
> doc|https://docs.google.com/document/d/1q5pjSWoTNVc9zsAfbNzJ-guHyVwPsEroIEP8Cca179A/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: Welcoming some new committers and PMC members

2019-09-10 Thread Dilip Biswal
Congratulations!! Very well deserved.
 
-- Dilip
 
- Original message -
From: "Kazuaki Ishizaki"
To: Matei Zaharia
Cc: dev
Subject: [EXTERNAL] Re: Welcoming some new committers and PMC members
Date: Mon, Sep 9, 2019 9:25 PM

Congrats! Well deserved.

Kazuaki Ishizaki

From: Matei Zaharia
To: dev
Date: 2019/09/10 09:32
Subject: [EXTERNAL] Welcoming some new committers and PMC members

Hi all,

The Spark PMC recently voted to add several new committers and one PMC member. Join me in welcoming them to their new roles!

New PMC member: Dongjoon Hyun
New committers: Ryan Blue, Liang-Chi Hsieh, Gengliang Wang, Yuming Wang, Weichen Xu, Ruifeng Zheng

The new committers cover lots of important areas including ML, SQL, and data sources, so it's great to have them here.

All the best,
Matei and the Spark PMC

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
 


-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[jira] [Created] (SPARK-29028) Add links to IBM Cloud Object Storage connector in cloud-integration.md

2019-09-09 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-29028:


 Summary: Add links to IBM Cloud Object Storage connector in 
cloud-integration.md
 Key: SPARK-29028
 URL: https://issues.apache.org/jira/browse/SPARK-29028
 Project: Spark
  Issue Type: Documentation
  Components: Documentation
Affects Versions: 2.4.4
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28965) Document workings of CBO

2019-09-03 Thread Dilip Biswal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dilip Biswal updated SPARK-28965:
-
Summary: Document workings of CBO  (was: Document workings for CBO)

> Document workings of CBO
> 
>
> Key: SPARK-28965
> URL: https://issues.apache.org/jira/browse/SPARK-28965
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28965) Document workings for CBO

2019-09-03 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921745#comment-16921745
 ] 

Dilip Biswal commented on SPARK-28965:
--

cc [~smilegator]

> Document workings for CBO
> -
>
> Key: SPARK-28965
> URL: https://issues.apache.org/jira/browse/SPARK-28965
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28965) Document workings for CBO

2019-09-03 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28965:


 Summary: Document workings for CBO
 Key: SPARK-28965
 URL: https://issues.apache.org/jira/browse/SPARK-28965
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28832) Document SHOW SCHEMAS statement in SQL Reference.

2019-08-22 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913560#comment-16913560
 ] 

Dilip Biswal commented on SPARK-28832:
--

[~jobitmathew] Thanks.. Yeah, it will be documented as part of SHOW DATABASES.
Please review [https://github.com/apache/spark/pull/25526] and let me know if
you want anything changed.

> Document SHOW SCHEMAS statement in SQL Reference.
> -
>
> Key: SPARK-28832
> URL: https://issues.apache.org/jira/browse/SPARK-28832
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: jobit mathew
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28819) Document CREATE OR REPLACE FUNCTION in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912976#comment-16912976
 ] 

Dilip Biswal edited comment on SPARK-28819 at 8/22/19 5:07 AM:
---

[~abhishek.akg]

Hmmn... There is one rule in our grammar file for CREATE FUNCTION ..
{code:java}
CREATE (OR REPLACE)? TEMPORARY? FUNCTION (IF NOT EXISTS)?
 qualifiedName AS className=STRING
 (USING resource (',' resource)*)? #createFunction
{code}
So I am not sure why we should separate this in the doc. My guess is that when
the Databricks docs were written, the REPLACE option was not there, and the
docs have not been updated to reflect this addition. The reason I say it
should be covered together is that otherwise a lot of boilerplate would need
to be duplicated.
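
For what it is worth, under that single rule both of the following parse (the
class names and jar path are placeholders):
{code:java}
scala> // Hypothetical UDF classes/jar, for illustration only
scala> spark.sql("CREATE FUNCTION fmt_name AS 'com.example.FmtName' USING JAR '/tmp/udfs.jar'")
scala> spark.sql("CREATE OR REPLACE FUNCTION fmt_name AS 'com.example.FmtNameV2' USING JAR '/tmp/udfs.jar'")
{code}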

cc [~smilegator] for his input.


was (Author: dkbiswal):
[~abhishek.akg]

Hmmn... There is one rule in our grammar file for CREATE FUNCTION ..
{code:java}
CREATE (OR REPLACE)? TEMPORARY? FUNCTION (IF NOT EXISTS)?
 qualifiedName AS className=STRING
 (USING resource (',' resource)*)? #createFunction
{code}
So not sure why we should separate in the doc. My guess is when the databricks 
docs were
 written .. REPLACE option was not there.. and it has not been updated to 
reflect this addition. The

reason i say that it should be covered together is because there is a lot of 
boilerplate stuff that will

need to be duplicated.

cc [~smilegator] for his input.

> Document CREATE OR REPLACE FUNCTION in SQL Reference
> 
>
> Key: SPARK-28819
> URL: https://issues.apache.org/jira/browse/SPARK-28819
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28819) Document CREATE OR REPLACE FUNCTION in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912976#comment-16912976
 ] 

Dilip Biswal commented on SPARK-28819:
--

[~abhishek.akg]

Hmmn... There is one rule in our grammar file for CREATE FUNCTION ..
{code:java}
CREATE (OR REPLACE)? TEMPORARY? FUNCTION (IF NOT EXISTS)?
 qualifiedName AS className=STRING
 (USING resource (',' resource)*)? #createFunction
{code}
So not sure why we should separate in the doc. My guess is when the databricks 
docs were
 written .. REPLACE option was not there.. and it has not been updated to 
reflect this addition. The

reason i say that it should be covered together is because there is a lot of 
boilerplate stuff that will

need to be duplicated.

cc [~smilegator] for his input.

> Document CREATE OR REPLACE FUNCTION in SQL Reference
> 
>
> Key: SPARK-28819
> URL: https://issues.apache.org/jira/browse/SPARK-28819
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28829) Document SET ROLE ADMIN in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912447#comment-16912447
 ] 

Dilip Biswal commented on SPARK-28829:
--

[~abhishek.akg] Is this statement supported in Spark? I don't see an
implementation for it.

> Document SET ROLE ADMIN in SQL Reference
> 
>
> Key: SPARK-28829
> URL: https://issues.apache.org/jira/browse/SPARK-28829
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28825) Document EXPLAIN Statement in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912368#comment-16912368
 ] 

Dilip Biswal commented on SPARK-28825:
--

I would like to take this one up.

> Document EXPLAIN Statement in SQL Reference.
> 
>
> Key: SPARK-28825
> URL: https://issues.apache.org/jira/browse/SPARK-28825
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: jobit mathew
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28820) Document SHOW FUNCTION LIKE SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912366#comment-16912366
 ] 

Dilip Biswal commented on SPARK-28820:
--

[~abhishek.akg] Isn't this the same as
https://issues.apache.org/jira/browse/SPARK-28808 ?

> Document SHOW FUNCTION LIKE SQL Reference
> -
>
> Key: SPARK-28820
> URL: https://issues.apache.org/jira/browse/SPARK-28820
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28821) Document COMPUTE STAT in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912358#comment-16912358
 ] 

Dilip Biswal commented on SPARK-28821:
--

[~abhishek.akg] Shouldn't this be covered by ANALYZE TABLE,
https://issues.apache.org/jira/browse/SPARK-28788 ?

> Document COMPUTE STAT in SQL Reference
> --
>
> Key: SPARK-28821
> URL: https://issues.apache.org/jira/browse/SPARK-28821
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28832) Document SHOW SCHEMAS statement in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912355#comment-16912355
 ] 

Dilip Biswal commented on SPARK-28832:
--

[~jobitmathew] Hasn't this already been documented as part of SHOW DATABASES?

> Document SHOW SCHEMAS statement in SQL Reference.
> -
>
> Key: SPARK-28832
> URL: https://issues.apache.org/jira/browse/SPARK-28832
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: jobit mathew
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28826) Document DROP TEMPORARY FUNCTION in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912350#comment-16912350
 ] 

Dilip Biswal commented on SPARK-28826:
--

[~abhishek.akg] Hello, I think this can be documented as part of DROP FUNCTION,
https://issues.apache.org/jira/browse/SPARK-28797 ?

> Document DROP TEMPORARY FUNCTION in SQL Reference
> -
>
> Key: SPARK-28826
> URL: https://issues.apache.org/jira/browse/SPARK-28826
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28824) Document CREATE TEMPORARY FUNCTION in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912348#comment-16912348
 ] 

Dilip Biswal commented on SPARK-28824:
--

[~abhishek.akg] I think this one can be documented along with CREATE FUNCTION,
https://issues.apache.org/jira/browse/SPARK-28793 ?

> Document CREATE TEMPORARY FUNCTION in SQL Reference
> ---
>
> Key: SPARK-28824
> URL: https://issues.apache.org/jira/browse/SPARK-28824
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28819) Document CREATE OR REPLACE FUNCTION in SQL Reference

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912344#comment-16912344
 ] 

Dilip Biswal commented on SPARK-28819:
--

[~shivuson...@gmail.com] I had created SPARK-28793 and started to work on it,
so this one is a duplicate. Could we please close this one?

> Document CREATE OR REPLACE FUNCTION in SQL Reference
> 
>
> Key: SPARK-28819
> URL: https://issues.apache.org/jira/browse/SPARK-28819
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28811) Document SHOW TBLPROPERTIES in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912339#comment-16912339
 ] 

Dilip Biswal commented on SPARK-28811:
--

I would take this one up.

> Document SHOW TBLPROPERTIES in SQL Reference.
> -
>
> Key: SPARK-28811
> URL: https://issues.apache.org/jira/browse/SPARK-28811
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28812) Document SHOW PARTITIONS in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912340#comment-16912340
 ] 

Dilip Biswal commented on SPARK-28812:
--

I would take this one up.

> Document SHOW PARTITIONS in SQL Reference.
> --
>
> Key: SPARK-28812
> URL: https://issues.apache.org/jira/browse/SPARK-28812
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28793) Document CREATE FUNCTION in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912338#comment-16912338
 ] 

Dilip Biswal commented on SPARK-28793:
--

I will take a look at this one.

> Document CREATE FUNCTION in SQL Reference.
> --
>
> Key: SPARK-28793
> URL: https://issues.apache.org/jira/browse/SPARK-28793
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28808) Document SHOW FUNCTIONS in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912333#comment-16912333
 ] 

Dilip Biswal commented on SPARK-28808:
--

[~sharangk] Hello.. very sorry.. I had started to work on this last night.
Could you please help review this PR and pick another one instead?
Thanks a lot.

> Document SHOW FUNCTIONS in SQL Reference.
> -
>
> Key: SPARK-28808
> URL: https://issues.apache.org/jira/browse/SPARK-28808
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28801) Document SELECT statement in SQL Reference.

2019-08-21 Thread Dilip Biswal (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912325#comment-16912325
 ] 

Dilip Biswal commented on SPARK-28801:
--

I am looking into this one.

> Document SELECT statement in SQL Reference.
> ---
>
> Key: SPARK-28801
> URL: https://issues.apache.org/jira/browse/SPARK-28801
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, SQL
>Affects Versions: 2.4.3
>Reporter: Dilip Biswal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28816) Document ADD JAR statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28816:


 Summary: Document ADD JAR statement in SQL Reference.
 Key: SPARK-28816
 URL: https://issues.apache.org/jira/browse/SPARK-28816
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28815) Document ADD FILE statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28815:


 Summary: Document ADD FILE statement in SQL Reference.
 Key: SPARK-28815
 URL: https://issues.apache.org/jira/browse/SPARK-28815
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28814) Document SET/RESET in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28814:


 Summary: Document SET/RESET in SQL Reference.
 Key: SPARK-28814
 URL: https://issues.apache.org/jira/browse/SPARK-28814
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28812) Document SHOW PARTITIONS in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28812:


 Summary: Document SHOW PARTITIONS in SQL Reference.
 Key: SPARK-28812
 URL: https://issues.apache.org/jira/browse/SPARK-28812
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28813) Document SHOW CREATE TABLE in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28813:


 Summary: Document SHOW CREATE TABLE in SQL Reference.
 Key: SPARK-28813
 URL: https://issues.apache.org/jira/browse/SPARK-28813
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28811) Document SHOW TBLPROPERTIES in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28811:


 Summary: Document SHOW TBLPROPERTIES in SQL Reference.
 Key: SPARK-28811
 URL: https://issues.apache.org/jira/browse/SPARK-28811
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28809) Document SHOW TABLE in SQL Reference

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28809:


 Summary: Document SHOW TABLE in SQL Reference
 Key: SPARK-28809
 URL: https://issues.apache.org/jira/browse/SPARK-28809
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28810) Document SHOW TABLES in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28810:


 Summary: Document SHOW TABLES in SQL Reference.
 Key: SPARK-28810
 URL: https://issues.apache.org/jira/browse/SPARK-28810
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28808) Document SHOW FUNCTIONS in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28808:


 Summary: Document SHOW FUNCTIONS in SQL Reference.
 Key: SPARK-28808
 URL: https://issues.apache.org/jira/browse/SPARK-28808
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28807) Document SHOW DATABASES in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28807:


 Summary: Document SHOW DATABASES in SQL Reference.
 Key: SPARK-28807
 URL: https://issues.apache.org/jira/browse/SPARK-28807
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28806) Document SHOW COLUMNS in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28806:


 Summary: Document SHOW COLUMNS in SQL Reference.
 Key: SPARK-28806
 URL: https://issues.apache.org/jira/browse/SPARK-28806
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28805) Document DESCRIBE FUNCTION in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28805:


 Summary: Document DESCRIBE FUNCTION in SQL Reference.
 Key: SPARK-28805
 URL: https://issues.apache.org/jira/browse/SPARK-28805
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28803) Document DESCRIBE TABLE in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28803:


 Summary: Document DESCRIBE TABLE in SQL Reference.
 Key: SPARK-28803
 URL: https://issues.apache.org/jira/browse/SPARK-28803
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28804) Document DESCRIBE QUERY in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28804:


 Summary: Document DESCRIBE QUERY in SQL Reference.
 Key: SPARK-28804
 URL: https://issues.apache.org/jira/browse/SPARK-28804
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28802) Document DESCRIBE DATABASE in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28802:


 Summary: Document DESCRIBE DATABASE in SQL Reference.
 Key: SPARK-28802
 URL: https://issues.apache.org/jira/browse/SPARK-28802
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28801) Document SELECT statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28801:


 Summary: Document SELECT statement in SQL Reference.
 Key: SPARK-28801
 URL: https://issues.apache.org/jira/browse/SPARK-28801
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28800) Document REPAIR TABLE statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28800:


 Summary: Document REPAIR TABLE statement in SQL Reference.
 Key: SPARK-28800
 URL: https://issues.apache.org/jira/browse/SPARK-28800
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28799) Document TRUNCATE TABLE in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28799:


 Summary: Document TRUNCATE TABLE in SQL Reference.
 Key: SPARK-28799
 URL: https://issues.apache.org/jira/browse/SPARK-28799
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28798) Document DROP TABLE/VIEW statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28798:


 Summary: Document DROP TABLE/VIEW statement in SQL Reference.
 Key: SPARK-28798
 URL: https://issues.apache.org/jira/browse/SPARK-28798
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28797) Document DROP FUNCTION statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28797:


 Summary: Document DROP FUNCTION statement in SQL Reference.
 Key: SPARK-28797
 URL: https://issues.apache.org/jira/browse/SPARK-28797
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28796) Document DROP DATABASE statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28796:


 Summary: Document DROP DATABASE statement in SQL Reference.
 Key: SPARK-28796
 URL: https://issues.apache.org/jira/browse/SPARK-28796
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28795) Document CREATE VIEW statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28795:


 Summary: Document CREATE VIEW statement in SQL Reference.
 Key: SPARK-28795
 URL: https://issues.apache.org/jira/browse/SPARK-28795
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28794) Document CREATE TABLE in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28794:


 Summary: Document CREATE TABLE in SQL Reference.
 Key: SPARK-28794
 URL: https://issues.apache.org/jira/browse/SPARK-28794
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28793) Document CREATE FUNCTION in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28793:


 Summary: Document CREATE FUNCTION in SQL Reference.
 Key: SPARK-28793
 URL: https://issues.apache.org/jira/browse/SPARK-28793
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28792) Document CREATE DATABASE statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28792:


 Summary: Document CREATE DATABASE statement in SQL Reference.
 Key: SPARK-28792
 URL: https://issues.apache.org/jira/browse/SPARK-28792
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28791) Document ALTER TABLE statement in SQL Reference.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28791:


 Summary: Document ALTER TABLE statement in SQL Reference.
 Key: SPARK-28791
 URL: https://issues.apache.org/jira/browse/SPARK-28791
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28789) Document ALTER DATABASE statement.

2019-08-20 Thread Dilip Biswal (Jira)
Dilip Biswal created SPARK-28789:


 Summary: Document ALTER DATABASE statement.
 Key: SPARK-28789
 URL: https://issues.apache.org/jira/browse/SPARK-28789
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: Release Spark 2.3.4

2019-08-17 Thread Dilip Biswal
+1
 
Regards,
Dilip Biswal
Tel: 408-463-4980
dbis...@us.ibm.com
 
 
- Original message -
From: John Zhuge
To: Xiao Li
Cc: Takeshi Yamamuro, Spark dev list, Kazuaki Ishizaki
Subject: [EXTERNAL] Re: Release Spark 2.3.4
Date: Fri, Aug 16, 2019 4:33 PM
+1 

On Fri, Aug 16, 2019 at 4:25 PM Xiao Li  wrote:
+1 

On Fri, Aug 16, 2019 at 4:11 PM Takeshi Yamamuro  wrote:
+1, too
 
Bests,
Takeshi 

On Sat, Aug 17, 2019 at 7:25 AM Dongjoon Hyun  wrote:
+1 for 2.3.4 release as the last release for `branch-2.3` EOL.
 
Also, +1 for next week release.
 
Bests,
Dongjoon.
  

On Fri, Aug 16, 2019 at 8:19 AM Sean Owen  wrote:
I think it's fine to do these in parallel, yes. Go ahead if you are willing.

On Fri, Aug 16, 2019 at 9:48 AM Kazuaki Ishizaki wrote:
> Hi, All.
>
> Spark 2.3.3 was released six months ago (15th February, 2019) at http://spark.apache.org/news/spark-2-3-3-released.html. And about 18 months have passed since Spark 2.3.0 was released (28th February, 2018).
> As of today (16th August), there are 103 commits (69 JIRAs) in `branch-23` since 2.3.3.
>
> It would be great if we can have Spark 2.3.4.
> If it is ok, shall we start `2.3.4 RC1` concurrent with 2.4.4, or after 2.4.4 is released?
>
> An issue list in jira: https://issues.apache.org/jira/projects/SPARK/versions/12344844
> A commit list in github from the last release: https://github.com/apache/spark/compare/66fd9c34bf406a4b5f86605d06c9607752bd637a...branch-2.3
> The 8 correctness issues resolved in branch-2.3:
> https://issues.apache.org/jira/browse/SPARK-26873?jql=project%20%3D%2012315420%20AND%20fixVersion%20%3D%2012344844%20AND%20labels%20in%20(%27correctness%27)%20ORDER%20BY%20priority%20DESC%2C%20key%20ASC
>
> Best Regards,
> Kazuaki Ishizaki

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[jira] [Created] (SPARK-28734) Create a table of content in the left hand side bar for SQL doc.

2019-08-14 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-28734:


 Summary: Create a table of content in the left hand side bar for 
SQL doc.
 Key: SPARK-28734
 URL: https://issues.apache.org/jira/browse/SPARK-28734
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-07-03 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878185#comment-16878185
 ] 

Dilip Biswal commented on SPARK-27768:
--

[~dongjoon] Thank you very much for your response.

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-27 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849078#comment-16849078
 ] 

Dilip Biswal commented on SPARK-27768:
--

[~dongjoon]
 Thanks for trying out Presto.

Just want to share my 2 cents before we take a final call on it. I am okay with
whatever you guys decide :).
There seems to be a subtle difference between Presto and Spark: Spark returns
"NULL" in this case, whereas Presto returns an error. Because of this, I think
we should be more accommodating of data that is accepted in other systems. I am
afraid that, because of the "authoring null" semantics, sometimes during the ETL
process we will treat some valid input from other systems as nulls, and it is
probably hard for users to locate the bad record and fix it.
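
To make the difference concrete, this is the kind of check I have in mind from
spark-shell (behaviour as I observe it on the affected versions, where only the
exact-case literal is recognized):
{code:java}
scala> // 'Infinity' parses today (case-sensitive); lowercase 'infinity' silently becomes null
scala> spark.sql("SELECT CAST('Infinity' AS DOUBLE) AS a, CAST('infinity' AS DOUBLE) AS b").show()
+--------+----+
|       a|   b|
+--------+----+
|Infinity|null|
+--------+----+
{code}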

Let's say for a second that we decide to accept this case. Technically, we
will then not be portable with Hive and Presto, but we are allowing something
more than these two systems, right? Do we think that some users would actually
want strings such as "infinity" to be treated as null, and would be negatively
surprised to see the new behaviour?
Let me know what you think..

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-21 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845277#comment-16845277
 ] 

Dilip Biswal edited comment on SPARK-27768 at 5/21/19 9:48 PM:
---

[~smilegator] [~dongjoon] I will wait for the test PR to get in first, and then 
I will open the PR for the infinity issue.


was (Author: dkbiswal):
[~smilegator] I will wait for the test PR to get in first, and then I will open 
the PR for the infinity issue.

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-21 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845277#comment-16845277
 ] 

Dilip Biswal commented on SPARK-27768:
--

[~smilegator] I will wait for the test PR to get in first, and then I will open 
the PR for the infinity issue.

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-20 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844473#comment-16844473
 ] 

Dilip Biswal commented on SPARK-27768:
--

OK [~smilegator]

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-20 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844372#comment-16844372
 ] 

Dilip Biswal commented on SPARK-27768:
--

Hi [~smilegator] and [~dongjoon],

Just checked the DB2 behaviour. FYI.

{code}
db2 => select cast('infinity' as decfloat) from foo;
1
--
                                  Infinity
  1 record(s) selected.

db2 => select cast('Infinity' as decfloat) from foo;
1
--
                                  Infinity
  1 record(s) selected.

db2 => select cast('iNfinity' as decfloat) from foo;
1
--
                                  Infinity
  1 record(s) selected.
{code}

Regards,
-- Dilip

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27768) Infinity, -Infinity, NaN should be recognized in a case insensitive manner

2019-05-20 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844351#comment-16844351
 ] 

Dilip Biswal commented on SPARK-27768:
--

Hi [~smilegator] [~dongjoon],

I had started to look at this. A few observations:

1) Apart from the handling of casting string to double/float, there seems to be 
another discrepancy with type promotion. In the first test case, while handling 
the inline table, we can't find a common type between string and double. Not 
sure if we want to do anything about it?
2) It seems we need to modify the casting code to handle this, i.e. handle it 
at runtime (see the sketch below). I made a trial 
[here|https://github.com/dilipbiswal/spark/pull/new/double_infinity].
(Note: I have tweaked the first test to use an int literal as opposed to a 
string literal to bypass (1).)

Not sure what direction we would like to take; just wanted to share my 
findings.
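
A rough sketch of that runtime handling (a hypothetical helper for 
illustration, not the code on the trial branch):

{code:scala}
import scala.util.Try

// Hypothetical helper: recognize the special constants case-insensitively
// before falling back to the normal numeric parse.
def toDoubleWithSpecials(s: String): Option[Double] =
  s.trim.toLowerCase(java.util.Locale.ROOT) match {
    case "infinity" | "+infinity" => Some(Double.PositiveInfinity)
    case "-infinity"              => Some(Double.NegativeInfinity)
    case "nan"                    => Some(Double.NaN)
    case other                    => Try(other.toDouble).toOption
  }

toDoubleWithSpecials("iNfinity") // Some(Infinity)
toDoubleWithSpecials("abc")      // None
{code}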

Regards,
Dilip

> Infinity, -Infinity, NaN should be recognized in a case insensitive manner
> --
>
> Key: SPARK-27768
> URL: https://issues.apache.org/jira/browse/SPARK-27768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> When the inputs contain the constant 'infinity', Spark SQL does not generate 
> the expected results.
> {code:java}
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('1'), (CAST('infinity' AS DOUBLE))) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('1')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('infinity'), ('infinity')) v(x);
> SELECT avg(CAST(x AS DOUBLE)), var_pop(CAST(x AS DOUBLE))
> FROM (VALUES ('-infinity'), ('infinity')) v(x);{code}
>  The root cause: Spark SQL does not recognize the special constants in a case 
> insensitive way. In PostgreSQL, they are recognized in a case insensitive 
> way. 
> Link: https://www.postgresql.org/docs/9.3/datatype-numeric.html 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27782) Use '#' to mark expression id embedded in the subquery name in the SubqueryExec operator.

2019-05-20 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-27782:


 Summary: Use '#' to mark expression id embedded in the subquery 
name in the SubqueryExec operator.
 Key: SPARK-27782
 URL: https://issues.apache.org/jira/browse/SPARK-27782
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.3
Reporter: Dilip Biswal


Use "#" to mark the expression id in the name field of `SubqueryExec` operator. 
Currently in SQLQueryTestSuite we anonymize the expression ids in the output 
file for comparison purposes. 
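
For context, that anonymization is essentially a regex rewrite of expression 
ids; a rough, illustrative sketch (not the exact suite code):

{code:scala}
// Illustrative only: mask expression ids such as "#123" with a stable
// placeholder so golden files do not depend on id generation order.
def anonymizeExprIds(planText: String): String =
  planText.replaceAll("#\\d+", "#x")

anonymizeExprIds("Subquery subquery#38") // "Subquery subquery#x"
{code}

Marking the id with "#" in the SubqueryExec name lets a pattern like this 
catch it.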



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27531) Improve explain output of describe table command to show the inputs to the command.

2019-04-20 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-27531:


 Summary: Improve explain output of describe table command to show 
the inputs to the command.
 Key: SPARK-27531
 URL: https://issues.apache.org/jira/browse/SPARK-27531
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.1
Reporter: Dilip Biswal


Currently "EXPLAIN DESC TABLE" is special cased and outputs a single row 
relation as following. This is not consistent with how we handle explain 
processing for other commands. 



Current output :
{code:java}
spark-sql> EXPLAIN DESCRIBE TABLE t1;
== Physical Plan ==
*(1) Scan OneRowRelation[]{code}
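
For comparison, an output shape consistent with how other commands explain 
might look like the following (illustrative only; the exact format is part of 
the proposed change):

{code:java}
spark-sql> EXPLAIN DESCRIBE TABLE t1;
== Physical Plan ==
Execute DescribeTableCommand
   +- DescribeTableCommand `t1`
{code}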



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27480) Improve explain output of describe query command to show the actual input query as opposed to a truncated logical plan.

2019-04-16 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-27480:


 Summary: Improve explain output of describe query command to show 
the actual input query as opposed to a truncated logical plan.
 Key: SPARK-27480
 URL: https://issues.apache.org/jira/browse/SPARK-27480
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.1
Reporter: Dilip Biswal


Currently, running explain on a describe query gives confusing output. Instead 
of showing the actual query entered by the user, it shows a truncated logical 
plan as the input. We should improve it to show the query text as entered by 
the user.

Here are the sample outputs of the explain command.

 
{code:java}
EXPLAIN DESCRIBE WITH s AS (SELECT 'hello' as col1) SELECT * FROM s;
== Physical Plan ==
Execute DescribeQueryCommand
   +- DescribeQueryCommand CTE [s]
{code}
{code:java}
EXPLAIN EXTENDED DESCRIBE SELECT * from s1 where c1 > 0;
== Physical Plan ==
Execute DescribeQueryCommand
   +- DescribeQueryCommand 'Project [*]
{code}
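
For illustration, an improved output would echo the query text as entered by 
the user (hypothetical shape, mirroring the CTE example above):

{code:java}
EXPLAIN DESCRIBE WITH s AS (SELECT 'hello' as col1) SELECT * FROM s;
== Physical Plan ==
Execute DescribeQueryCommand
   +- DescribeQueryCommand WITH s AS (SELECT 'hello' as col1) SELECT * FROM s
{code}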



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


