[jira] [Updated] (SPARK-41945) Python: connect client lost column data with pyarrow.Table.to_pylist

2023-01-08 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41945: --- Description: Python: connect client should not use pyarrow.Table.to_pylist to transform fetched

[jira] [Updated] (SPARK-41945) Python: connect client lost column data with pyarrow.Table.to_pylist

2023-01-08 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41945: --- Description: Python: connect client should not use pyarrow.Table.to_pylist to transform fetched

[jira] [Commented] (SPARK-41945) Python: connect client lost column data with pyarrow.Table.to_pylist

2023-01-08 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655911#comment-17655911 ] jiaan.geng commented on SPARK-41945: I'm working on it. > Python: connect client lost column data with

[jira] [Created] (SPARK-41945) Python: connect client lost column data with pyarrow.Table.to_pylist

2023-01-08 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41945: -- Summary: Python: connect client lost column data with pyarrow.Table.to_pylist Key: SPARK-41945 URL: https://issues.apache.org/jira/browse/SPARK-41945 Project: Spark

[jira] (SPARK-41904) Fix Function `nth_value` functions output

2023-01-07 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41904 ] jiaan.geng deleted comment on SPARK-41904: was (Author: beliefer): [~techaddict] Could you tell me how to reproduce this issue? I want to take a look! > Fix Function `nth_value` functions

[jira] [Commented] (SPARK-41904) Fix Function `nth_value` functions output

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655629#comment-17655629 ] jiaan.geng commented on SPARK-41904: [~techaddict] Could you tell me how to reproduce this issue? I

[jira] [Comment Edited] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655434#comment-17655434 ] jiaan.geng edited comment on SPARK-41824 at 1/6/23 12:56 PM: - I found scala

[jira] [Comment Edited] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655434#comment-17655434 ] jiaan.geng edited comment on SPARK-41824 at 1/6/23 12:56 PM: - I found scala

[jira] [Commented] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655434#comment-17655434 ] jiaan.geng commented on SPARK-41824: I found scala API Dataset.explain print the same output as

[jira] [Commented] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655422#comment-17655422 ] jiaan.geng commented on SPARK-41824: [~techaddict] I ran `./python/run-tests --testnames

[jira] [Commented] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655417#comment-17655417 ] jiaan.geng commented on SPARK-41824: Thank you. I will investigate it. > Implement

[jira] (SPARK-41879) `DataFrame.collect` should support nested types

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41879 ] jiaan.geng deleted comment on SPARK-41879: was (Author: beliefer): I will take a look! > `DataFrame.collect` should support nested types > --- >

[jira] [Commented] (SPARK-41879) `DataFrame.collect` should support nested types

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655350#comment-17655350 ] jiaan.geng commented on SPARK-41879: I will take a look! > `DataFrame.collect` should support

[jira] [Comment Edited] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654852#comment-17654852 ] jiaan.geng edited comment on SPARK-41824 at 1/6/23 8:17 AM: I can't

[jira] [Commented] (SPARK-41875) Throw proper errors in Dataset.to()

2023-01-05 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655203#comment-17655203 ] jiaan.geng commented on SPARK-41875: I will take a look! > Throw proper errors in Dataset.to() >

[jira] [Commented] (SPARK-41824) Implement DataFrame.explain format to be similar to PySpark

2023-01-05 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654852#comment-17654852 ] jiaan.geng commented on SPARK-41824: I will take a look! > Implement DataFrame.explain format to be

[jira] [Comment Edited] (SPARK-41875) Throw proper errors in Dataset.to()

2023-01-05 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654845#comment-17654845 ] jiaan.geng edited comment on SPARK-41875 at 1/5/23 8:00 AM: It seems this

[jira] [Comment Edited] (SPARK-41875) Throw proper errors in Dataset.to()

2023-01-05 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654845#comment-17654845 ] jiaan.geng edited comment on SPARK-41875 at 1/5/23 7:59 AM: It seems this

[jira] [Commented] (SPARK-41875) Throw proper errors in Dataset.to()

2023-01-04 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654845#comment-17654845 ] jiaan.geng commented on SPARK-41875: It seems this isn't the issue of connect. > Throw proper errors

[jira] [Updated] (SPARK-41888) Support StreamingQueryListener for DataFrame.observe

2023-01-04 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41888: --- Summary: Support StreamingQueryListener for DataFrame.observe (was: Support StreamingQueryListener

[jira] [Updated] (SPARK-41888) Support StreamingQueryListener for connect

2023-01-04 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41888: --- Description: {code:java} **

[jira] [Created] (SPARK-41888) Support StreamingQueryListener for connect

2023-01-04 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41888: -- Summary: Support StreamingQueryListener for connect Key: SPARK-41888 URL: https://issues.apache.org/jira/browse/SPARK-41888 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-41066) Implement `DataFrame.sampleBy ` and `DataFrame.stat.sampleBy `

2022-12-31 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17653293#comment-17653293 ] jiaan.geng commented on SPARK-41066: I'll try this. > Implement `DataFrame.sampleBy ` and

[jira] (SPARK-41065) Implement `DataFrame.freqItems ` and `DataFrame.stat.freqItems `

2022-12-31 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41065 ] jiaan.geng deleted comment on SPARK-41065: was (Author: beliefer): I'll try this. > Implement `DataFrame.freqItems ` and `DataFrame.stat.freqItems ` >

[jira] (SPARK-41069) Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile`

2022-12-27 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41069 ] jiaan.geng deleted comment on SPARK-41069: was (Author: beliefer): I will try this. > Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile` >

[jira] [Commented] (SPARK-41069) Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile`

2022-12-27 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652355#comment-17652355 ] jiaan.geng commented on SPARK-41069: I will try this. > Implement `DataFrame.approxQuantile` and

[jira] [Created] (SPARK-41736) pyspark_types_to_proto_types should supports ArrayType

2022-12-27 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41736: -- Summary: pyspark_types_to_proto_types should supports ArrayType Key: SPARK-41736 URL: https://issues.apache.org/jira/browse/SPARK-41736 Project: Spark Issue

[jira] [Commented] (SPARK-41065) Implement `DataFrame.freqItems ` and `DataFrame.stat.freqItems `

2022-12-27 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652239#comment-17652239 ] jiaan.geng commented on SPARK-41065: I'll try this. > Implement `DataFrame.freqItems ` and

[jira] (SPARK-41068) Implement `DataFrame.stat.corr`

2022-12-27 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41068 ] jiaan.geng deleted comment on SPARK-41068: was (Author: beliefer): I will try. > Implement `DataFrame.stat.corr` > --- > > Key: SPARK-41068 >

[jira] (SPARK-41067) Implement `DataFrame.stat.cov`

2022-12-27 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41067 ] jiaan.geng deleted comment on SPARK-41067: was (Author: beliefer): I will try it! > Implement `DataFrame.stat.cov` > -- > > Key: SPARK-41067 >

[jira] [Commented] (SPARK-41068) Implement `DataFrame.stat.corr`

2022-12-26 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652173#comment-17652173 ] jiaan.geng commented on SPARK-41068: I will try. > Implement `DataFrame.stat.corr` >

[jira] [Comment Edited] (SPARK-41067) Implement `DataFrame.stat.cov`

2022-12-26 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651977#comment-17651977 ] jiaan.geng edited comment on SPARK-41067 at 12/26/22 9:18 AM: -- I will try

[jira] [Commented] (SPARK-41067) Implement `DataFrame.stat.cov`

2022-12-26 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651977#comment-17651977 ] jiaan.geng commented on SPARK-41067: I will try! > Implement `DataFrame.stat.cov` >

[jira] [Created] (SPARK-41706) pyspark_types_to_proto_types should supports MapType

2022-12-25 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41706: -- Summary: pyspark_types_to_proto_types should supports MapType Key: SPARK-41706 URL: https://issues.apache.org/jira/browse/SPARK-41706 Project: Spark Issue Type:

[jira] (SPARK-41464) Implement DataFrame.to

2022-12-25 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41464 ] jiaan.geng deleted comment on SPARK-41464: was (Author: beliefer): OK > Implement DataFrame.to > -- > > Key: SPARK-41464 > URL:

[jira] [Updated] (SPARK-41674) Runtime filter should supports multi level shuffle join side as filter creation side

2022-12-22 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41674: --- Summary: Runtime filter should supports multi level shuffle join side as filter creation side

[jira] [Created] (SPARK-41674) Runtime filter should supports the any side of child join as filter creation side

2022-12-21 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41674: -- Summary: Runtime filter should supports the any side of child join as filter creation side Key: SPARK-41674 URL: https://issues.apache.org/jira/browse/SPARK-41674

[jira] (SPARK-41546) pyspark_types_to_proto_types should supports StructType.

2022-12-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41546 ] jiaan.geng deleted comment on SPARK-41546: was (Author: beliefer): I'm working on it. > pyspark_types_to_proto_types should supports StructType. >

[jira] (SPARK-41527) Implement DataFrame.observe

2022-12-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41527 ] jiaan.geng deleted comment on SPARK-41527: was (Author: beliefer): I'm working on it. > Implement DataFrame.observe > --- > > Key: SPARK-41527 >

[jira] [Created] (SPARK-41546) pyspark_types_to_proto_types should supports StructType.

2022-12-16 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41546: -- Summary: pyspark_types_to_proto_types should supports StructType. Key: SPARK-41546 URL: https://issues.apache.org/jira/browse/SPARK-41546 Project: Spark Issue

[jira] [Commented] (SPARK-41546) pyspark_types_to_proto_types should supports StructType.

2022-12-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648615#comment-17648615 ] jiaan.geng commented on SPARK-41546: I'm working on it. > pyspark_types_to_proto_types should supports

[jira] [Commented] (SPARK-41527) Implement DataFrame.observe

2022-12-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648395#comment-17648395 ] jiaan.geng commented on SPARK-41527: I'm working on it. > Implement DataFrame.observe >

[jira] [Commented] (SPARK-41464) Implement DataFrame.to

2022-12-15 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647948#comment-17647948 ] jiaan.geng commented on SPARK-41464: OK > Implement DataFrame.to > -- > >

[jira] [Commented] (SPARK-41453) Implement DataFrame.subtract

2022-12-15 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647907#comment-17647907 ] jiaan.geng commented on SPARK-41453: OK. > Implement DataFrame.subtract >

[jira] [Updated] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41509: --- Description: Currently, Spark runtime filter supports bloom filter and in subquery filter. The in

[jira] [Created] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41509: -- Summary: Delay execution hash until after aggregation for semi-join runtime filter. Key: SPARK-41509 URL: https://issues.apache.org/jira/browse/SPARK-41509 Project:

[jira] (SPARK-41438) Implement DataFrame. colRegex

2022-12-12 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41438 ] jiaan.geng deleted comment on SPARK-41438: was (Author: beliefer): I'm working on it. > Implement DataFrame. colRegex > - > > Key: SPARK-41438 >

[jira] [Commented] (SPARK-41438) Implement DataFrame. colRegex

2022-12-11 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645942#comment-17645942 ] jiaan.geng commented on SPARK-41438: I'm working on it. > Implement DataFrame. colRegex >

[jira] (SPARK-41440) Implement DataFrame.randomSplit

2022-12-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41440 ] jiaan.geng deleted comment on SPARK-41440: was (Author: beliefer): I'm working on it. > Implement DataFrame.randomSplit > --- > > Key: SPARK-41440 >

[jira] [Commented] (SPARK-41440) Implement DataFrame.randomSplit

2022-12-08 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645066#comment-17645066 ] jiaan.geng commented on SPARK-41440: I'm working on it. > Implement DataFrame.randomSplit >

[jira] [Updated] (SPARK-41439) Implement `DataFrame.melt` and `DataFrame.unpivot`

2022-12-08 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41439: --- Summary: Implement `DataFrame.melt` and `DataFrame.unpivot` (was: Implement `DataFrame.melt`) >

[jira] (SPARK-41439) Implement `DataFrame.melt`

2022-12-07 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41439 ] jiaan.geng deleted comment on SPARK-41439: was (Author: beliefer): I'm working on it. > Implement `DataFrame.melt` > -- > > Key: SPARK-41439 >

[jira] [Commented] (SPARK-41439) Implement `DataFrame.melt`

2022-12-07 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644258#comment-17644258 ] jiaan.geng commented on SPARK-41439: I'm working on it. > Implement `DataFrame.melt` >

[jira] (SPARK-41438) Implement DataFrame. colRegex

2022-12-07 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41438 ] jiaan.geng deleted comment on SPARK-41438: was (Author: beliefer): I'm working on it. > Implement DataFrame. colRegex > - > > Key: SPARK-41438 >

[jira] [Commented] (SPARK-41438) Implement DataFrame. colRegex

2022-12-07 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644242#comment-17644242 ] jiaan.geng commented on SPARK-41438: I'm working on it. > Implement DataFrame. colRegex >

[jira] [Commented] (SPARK-41403) Implement DataFrame.describe

2022-12-06 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643733#comment-17643733 ] jiaan.geng commented on SPARK-41403: [~podongfeng] Thank you for your ping. I will try to do this!

[jira] [Resolved] (SPARK-41337) Add a physical rule to remove the partialLimitExec node

2022-12-02 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng resolved SPARK-41337. Resolution: Invalid > Add a physical rule to remove the partialLimitExec node >

[jira] [Created] (SPARK-41337) Add a physical rule to remove the partialLimitExec node

2022-11-30 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41337: -- Summary: Add a physical rule to remove the partialLimitExec node Key: SPARK-41337 URL: https://issues.apache.org/jira/browse/SPARK-41337 Project: Spark Issue

[jira] [Updated] (SPARK-41171) Push down filter through window when partitionSpec is empty

2022-11-17 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41171: --- Description: Sometimes, filter compares the rank-like window functions with number. {code:java}

[jira] [Created] (SPARK-41171) Push down filter through window when partitionSpec is empty

2022-11-17 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41171: -- Summary: Push down filter through window when partitionSpec is empty Key: SPARK-41171 URL: https://issues.apache.org/jira/browse/SPARK-41171 Project: Spark

[jira] [Updated] (SPARK-40986) Add aggregate to reduce the data size for bloom filter

2022-11-01 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40986: --- Summary: Add aggregate to reduce the data size for bloom filter (was: Using distinct to reduce the

[jira] [Updated] (SPARK-40986) Using distinct to reduce the data size for bloom filter

2022-11-01 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40986: --- Summary: Using distinct to reduce the data size for bloom filter (was: Add extra aggregate on join

[jira] [Created] (SPARK-40986) Add extra aggregate on join key for bloom filter

2022-11-01 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40986: -- Summary: Add extra aggregate on join key for bloom filter Key: SPARK-40986 URL: https://issues.apache.org/jira/browse/SPARK-40986 Project: Spark Issue Type:

[jira] [Updated] (SPARK-40909) Reuse the broadcast exchange for bloom filter

2022-10-25 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40909: --- Description: Currently, if the creation side of bloom filter could be broadcasted, Spark cannot

[jira] [Updated] (SPARK-40909) Reuse the broadcast exchange for bloom filter

2022-10-25 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40909: --- Description: Currently, if the creation side of bloom filter could be broadcasted, Spark cannot

[jira] [Updated] (SPARK-40909) Reuse the broadcast exchange for bloom filter

2022-10-25 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40909: --- Description: Currently, if the creation side of bloom filter could be broadcasted, Spark cannot

[jira] [Created] (SPARK-40909) Reuse the broadcast exchange for bloom filter

2022-10-25 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40909: -- Summary: Reuse the broadcast exchange for bloom filter Key: SPARK-40909 URL: https://issues.apache.org/jira/browse/SPARK-40909 Project: Spark Issue Type:

[jira] [Updated] (SPARK-40716) Relax the restrictions of broadcast join on bloom filter

2022-10-09 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40716: --- Description: Currently, if the creation side of bloom filter could be broadcasted, Spark cannot

[jira] [Updated] (SPARK-40716) Relax the restrictions of broadcast join on bloom filter

2022-10-09 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40716: --- Description: Currently, > Relax the restrictions of broadcast join on bloom filter >

[jira] [Created] (SPARK-40716) Relax the restrictions of broadcast join on bloom filter

2022-10-09 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40716: -- Summary: Relax the restrictions of broadcast join on bloom filter Key: SPARK-40716 URL: https://issues.apache.org/jira/browse/SPARK-40716 Project: Spark Issue

[jira] [Created] (SPARK-40611) Improve the performance for setInterval & getInterval of UnsafeRow

2022-09-29 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40611: -- Summary: Improve the performance for setInterval & getInterval of UnsafeRow Key: SPARK-40611 URL: https://issues.apache.org/jira/browse/SPARK-40611 Project: Spark

[jira] [Updated] (SPARK-40491) Remove too old TODO for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Summary: Remove too old TODO for JdbcRDD (was: Remove too old todo comments for JdbcRDD) > Remove

[jira] [Updated] (SPARK-40491) Remove too old todo comments for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Description: According to the legacy document of JdbcRDD, we need to expose a jdbcRDD function in

[jira] [Updated] (SPARK-40491) Remove too old todo comments for JdbcRDD

2022-09-20 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Summary: Remove too old todo comments for JdbcRDD (was: Expose a jdbcRDD function in SparkContext)

[jira] [Updated] (SPARK-40491) Expose a jdbcRDD function in SparkContext

2022-09-19 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40491: --- Description: According to the legacy document of JdbcRDD, we need to expose a jdbcRDD function in

[jira] [Created] (SPARK-40491) Expose a jdbcRDD function in SparkContext

2022-09-19 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40491: -- Summary: Expose a jdbcRDD function in SparkContext Key: SPARK-40491 URL: https://issues.apache.org/jira/browse/SPARK-40491 Project: Spark Issue Type: New

[jira] [Created] (SPARK-40465) Refactor Decimal so as we can use Int128 as underlying implementation

2022-09-15 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40465: -- Summary: Refactor Decimal so as we can use Int128 as underlying implementation Key: SPARK-40465 URL: https://issues.apache.org/jira/browse/SPARK-40465 Project: Spark

[jira] [Created] (SPARK-40387) Improve the implementation of Spark Decimal

2022-09-08 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40387: -- Summary: Improve the implementation of Spark Decimal Key: SPARK-40387 URL: https://issues.apache.org/jira/browse/SPARK-40387 Project: Spark Issue Type:

[jira] [Created] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-30 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40285: -- Summary: Simplify the roundTo[Numeric] for Decimal Key: SPARK-40285 URL: https://issues.apache.org/jira/browse/SPARK-40285 Project: Spark Issue Type:

[jira] [Created] (SPARK-40275) Support casting decimal128

2022-08-30 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40275: -- Summary: Support casting decimal128 Key: SPARK-40275 URL: https://issues.apache.org/jira/browse/SPARK-40275 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-40268) Test decimal128 in UDF

2022-08-29 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40268: -- Summary: Test decimal128 in UDF Key: SPARK-40268 URL: https://issues.apache.org/jira/browse/SPARK-40268 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-40217) Support java.math.BigDecimal as an external type of Decimal128 type

2022-08-25 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40217: -- Summary: Support java.math.BigDecimal as an external type of Decimal128 type Key: SPARK-40217 URL: https://issues.apache.org/jira/browse/SPARK-40217 Project: Spark

[jira] [Created] (SPARK-40203) Add test cases for Spark Decimal

2022-08-24 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40203: -- Summary: Add test cases for Spark Decimal Key: SPARK-40203 URL: https://issues.apache.org/jira/browse/SPARK-40203 Project: Spark Issue Type: Test

[jira] (SPARK-40100) Add DataType class for Int128 type

2022-08-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40100 ] jiaan.geng deleted comment on SPARK-40100: was (Author: beliefer): I'm working on it. > Add DataType class for Int128 type > -- > > Key:

[jira] [Updated] (SPARK-40100) Add DataType class for Int128 type

2022-08-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40100: --- Summary: Add DataType class for Int128 type (was: Add Int128 type) > Add DataType class for Int128

[jira] [Commented] (SPARK-40100) Add Int128 type

2022-08-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580084#comment-17580084 ] jiaan.geng commented on SPARK-40100: I'm working on it. > Add Int128 type > --- > >

[jira] [Created] (SPARK-40100) Add Int128 type

2022-08-16 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40100: -- Summary: Add Int128 type Key: SPARK-40100 URL: https://issues.apache.org/jira/browse/SPARK-40100 Project: Spark Issue Type: Sub-task Components: SQL

[jira] [Updated] (SPARK-40097) Support Int128 type

2022-08-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40097: --- Description: Spark SQL today supports the Decimal data type. The implementation of Spark Decimal

[jira] [Updated] (SPARK-40097) Support Int128 type

2022-08-16 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40097: --- Attachment: Benchmark for performance comparison between Int128 and Spark decimal.pdf > Support

[jira] [Created] (SPARK-40097) Support Int128 type

2022-08-16 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-40097: -- Summary: Support Int128 type Key: SPARK-40097 URL: https://issues.apache.org/jira/browse/SPARK-40097 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-40013) DS V2 expressions should have the default toString

2022-08-12 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40013: --- Summary: DS V2 expressions should have the default toString (was: DS V2 expressions should have

[jira] [Reopened] (SPARK-40013) DS V2 expressions should have the default implementation of toString

2022-08-12 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng reopened SPARK-40013: > DS V2 expressions should have the default implementation of toString >

[jira] [Commented] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577942#comment-17577942 ] jiaan.geng commented on SPARK-40032: We are editing the design doc. > Support Decimal128 type >

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Attachment: Performance comparison between decimal128 and spark decimal benchmark.pdf > Support

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Attachment: (was: Performance comparison between decimal128 and spark decimal benchmark.pdf) >

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Attachment: Performance comparison between decimal128 and spark decimal benchmark.pdf > Support

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Attachment: (was: Decimal128与Spark Decimal性能比较Benchmark.pdf) > Support Decimal128 type >

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Attachment: Decimal128与Spark Decimal性能比较Benchmark.pdf > Support Decimal128 type >

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Description: Spark SQL today supports the DECIMAL data type. The implementation of Decimal that

[jira] [Updated] (SPARK-40032) Support Decimal128 type

2022-08-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-40032: --- Description: Spark SQL today supports the DECIMAL data type. The implementation of Decimal that
