[jira] [Resolved] (SPARK-42288) Expose file path if reading failed

2023-04-21 Thread yikaifei (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yikaifei resolved SPARK-42288.
--
Resolution: Duplicate

> Expose file path if reading failed
> --
>
> Key: SPARK-42288
> URL: https://issues.apache.org/jira/browse/SPARK-42288
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: yikaifei
>Priority: Minor
>
> A `MalformedInputException` may be thrown when decompression fails while 
> reading a file. In this case, the error message does not contain the file 
> path; including it would make it much easier to locate the problematic file.
> {code:java}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 41 in 
> stage 15641.0 failed 10 times, most recent failure: Lost task 41.9 in stage 
> 15641.0 (TID 6287211) (hostname executor 58): 
> io.airlift.compress.MalformedInputException: Malformed input: offset=65075
>   at 
> io.airlift.compress.snappy.SnappyRawDecompressor.uncompressAll(SnappyRawDecompressor.java:108)
>   at 
> io.airlift.compress.snappy.SnappyRawDecompressor.decompress(SnappyRawDecompressor.java:53)
>   at 
> io.airlift.compress.snappy.SnappyDecompressor.decompress(SnappyDecompressor.java:45)
>   at 
> org.apache.orc.impl.AircompressorCodec.decompress(AircompressorCodec.java:94)
>   at org.apache.orc.impl.SnappyCodec.decompress(SnappyCodec.java:45)
>   at 
> org.apache.orc.impl.InStream$CompressedStream.readHeader(InStream.java:495)
>   at 
> org.apache.orc.impl.InStream$CompressedStream.ensureUncompressed(InStream.java:522)
>   at org.apache.orc.impl.InStream$CompressedStream.read(InStream.java:509)
>   at 
> org.apache.orc.impl.SerializationUtils.readRemainingLongs(SerializationUtils.java:1102)
>   at 
> org.apache.orc.impl.SerializationUtils.unrolledUnPackBytes(SerializationUtils.java:1094)
>   at 
> org.apache.orc.impl.SerializationUtils.unrolledUnPack32(SerializationUtils.java:1059)
>   at 
> org.apache.orc.impl.SerializationUtils.readInts(SerializationUtils.java:925)
>   at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:268)
>   at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:69)
>   at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:323)
>   at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:373)
>   at 
> org.apache.orc.impl.TreeReaderFactory$LongTreeReader.nextVector(TreeReaderFactory.java:641)
>   at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:2047)
>   at 
> org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1219)
>   at 
> org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.nextBatch(OrcColumnarBatchReader.java:197)
>   at 
> org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.nextKeyValue(OrcColumnarBatchReader.java:99)
>   at 
> org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
>   at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
>   at 
> org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:522)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage8.columnartorow_nextBatch_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage8.agg_doAggregateWithKeys_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage8.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
>   at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
>   at 
> org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:179)
>   at 
> org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>   at org.apache.spark.scheduler.Task.run(Task.scala:131)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:510)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:513)
> {code}

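For context, a minimal sketch of the kind of wrapping the improvement asks for, assuming a hypothetical helper (`readWithPathContext` is not Spark's actual reader code): rethrow a per-file read or decompression failure with the offending path attached.

{code:java}
// Illustrative sketch only (hypothetical helper, not Spark's implementation):
// rethrow a per-file read failure with the offending file path attached.
import java.io.IOException

def readWithPathContext[T](path: String)(read: => T): T =
  try read
  catch {
    case e: Exception =>
      throw new IOException(s"Failed to read file: $path", e)
  }
{code}
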
[jira] [Created] (SPARK-43228) Join keys also match PartitioningCollection

2023-04-21 Thread Yuming Wang (Jira)
Yuming Wang created SPARK-43228:
---

 Summary: Join keys also match PartitioningCollection
 Key: SPARK-43228
 URL: https://issues.apache.org/jira/browse/SPARK-43228
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.0
Reporter: Yuming Wang









[jira] [Created] (SPARK-43229) Support Barrier Python UDF

2023-04-21 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-43229:
-

 Summary: Support Barrier Python UDF
 Key: SPARK-43229
 URL: https://issues.apache.org/jira/browse/SPARK-43229
 Project: Spark
  Issue Type: New Feature
  Components: Connect, ML, PySpark
Affects Versions: 3.5.0
Reporter: Ruifeng Zheng









[jira] [Commented] (SPARK-43156) Correctness COUNT bug in correlated scalar subselect with `COUNT(*) is null`

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714903#comment-17714903
 ] 

ASF GitHub Bot commented on SPARK-43156:


User 'Hisoka-X' has created a pull request for this issue:
https://github.com/apache/spark/pull/40865

> Correctness COUNT bug in correlated scalar subselect with `COUNT(*) is null`
> 
>
> Key: SPARK-43156
> URL: https://issues.apache.org/jira/browse/SPARK-43156
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Jack Chen
>Priority: Major
>
> Example query:
> {code:java}
> spark.sql("select *, (select (count(1)) is null from t1 where t0.a = t1.c) 
> from t0").collect()
> res6: Array[org.apache.spark.sql.Row] = Array([1,1.0,null], [2,2.0,false])  
> {code}
> In this subquery, count(1) always evaluates to a non-null integer value, so 
> count(1) is null is always false. The correct evaluation of the subquery is 
> always false.
> We incorrectly evaluate it to null for empty groups. The reason is that 
> NullPropagation rewrites Aggregate [c] [isnull(count(1))] to Aggregate [c] 
> [false] - this rewrite would be correct normally, but in the context of a 
> scalar subquery it breaks our count bug handling in 
> RewriteCorrelatedScalarSubquery.constructLeftJoins. By the time we get 
> there, the query appears not to have the count bug - it looks the same as if 
> the original query had a subquery with select any_value(false) from r..., and 
> that case is _not_ subject to the count bug.
>  
> A Postgres comparison shows the correct always-false result: 
> [http://sqlfiddle.com/#!17/67822/5]
> DDL for the example:
> {code:java}
> create or replace temp view t0 (a, b)
> as values
>     (1, 1.0),
>     (2, 2.0);
> create or replace temp view t1 (c, d)
> as values
>     (2, 3.0); {code}
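For reference, a hand-rewritten, count-bug-aware equivalent of the subquery above (illustrative only; this mirrors the left-join rewrite the description refers to, not the plan Spark actually produces):

{code:java}
// With the DDL above, an outer row with no match in t1 has count(1) = 0,
// so `count(1) IS NULL` must evaluate to false, never null.
spark.sql("""
  SELECT t0.*, coalesce(cnt, 0) IS NULL AS sub
  FROM t0
  LEFT JOIN (SELECT c, count(1) AS cnt FROM t1 GROUP BY c) agg
    ON t0.a = agg.c
""").collect()
// Expected: Array([1,1.0,false], [2,2.0,false])
{code}
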






[jira] [Commented] (SPARK-43229) Support Barrier Python UDF

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714902#comment-17714902
 ] 

ASF GitHub Bot commented on SPARK-43229:


User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40896

> Support Barrier Python UDF
> --
>
> Key: SPARK-43229
> URL: https://issues.apache.org/jira/browse/SPARK-43229
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect, ML, PySpark
>Affects Versions: 3.5.0
>Reporter: Ruifeng Zheng
>Priority: Major
>







[jira] [Commented] (SPARK-43128) Streaming progress struct (especially in Scala)

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714904#comment-17714904
 ] 

ASF GitHub Bot commented on SPARK-43128:


User 'bogao007' has created a pull request for this issue:
https://github.com/apache/spark/pull/40895

> Streaming progress struct (especially in Scala)
> ---
>
> Key: SPARK-43128
> URL: https://issues.apache.org/jira/browse/SPARK-43128
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Raghu Angadi
>Priority: Major
>
> Streaming Spark Connect transfers streaming progress as a full JSON string.
> This works fine for Python, since it does not define a schema for it, but in 
> Scala the progress is a full-fledged class. We need to decide whether we want 
> to match the legacy Progress struct in Spark Connect.
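A minimal sketch of the two options under discussion (field names are an illustrative subset, not the full legacy schema): pass the progress through as the raw JSON string, or map it onto a typed struct mirroring the legacy progress class.

{code:java}
// Illustrative only: a hypothetical typed shape for Connect, keeping the raw JSON
// alongside a few typed fields. Whether to mirror the full legacy
// StreamingQueryProgress is the open question.
final case class ConnectStreamingProgress(
  id: String,
  runId: String,
  batchId: Long,
  numInputRows: Long,
  json: String // the raw payload the server already produces
)
{code}
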






[jira] [Created] (SPARK-43230) Simplify `DataFrameNaFunctions.fillna`

2023-04-21 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-43230:
-

 Summary: Simplify `DataFrameNaFunctions.fillna`
 Key: SPARK-43230
 URL: https://issues.apache.org/jira/browse/SPARK-43230
 Project: Spark
  Issue Type: New Feature
  Components: Connect
Affects Versions: 3.5.0
Reporter: Ruifeng Zheng









[jira] [Commented] (SPARK-43230) Simplify `DataFrameNaFunctions.fillna`

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714906#comment-17714906
 ] 

ASF GitHub Bot commented on SPARK-43230:


User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40898

> Simplify `DataFrameNaFunctions.fillna`
> --
>
> Key: SPARK-43230
> URL: https://issues.apache.org/jira/browse/SPARK-43230
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: Ruifeng Zheng
>Priority: Minor
>







[jira] [Commented] (SPARK-43228) Join keys also match PartitioningCollection

2023-04-21 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714907#comment-17714907
 ] 

Yuming Wang commented on SPARK-43228:
-

https://github.com/apache/spark/pull/40897

> Join keys also match PartitioningCollection
> ---
>
> Key: SPARK-43228
> URL: https://issues.apache.org/jira/browse/SPARK-43228
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Yuming Wang
>Priority: Major
>







[jira] [Commented] (SPARK-43228) Join keys also match PartitioningCollection

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714909#comment-17714909
 ] 

ASF GitHub Bot commented on SPARK-43228:


User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/40897

> Join keys also match PartitioningCollection
> ---
>
> Key: SPARK-43228
> URL: https://issues.apache.org/jira/browse/SPARK-43228
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Yuming Wang
>Priority: Major
>







[jira] [Commented] (SPARK-43199) Make InlineCTE idempotent

2023-04-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714910#comment-17714910
 ] 

ASF GitHub Bot commented on SPARK-43199:


User 'peter-toth' has created a pull request for this issue:
https://github.com/apache/spark/pull/40856

> Make InlineCTE idempotent
> -
>
> Key: SPARK-43199
> URL: https://issues.apache.org/jira/browse/SPARK-43199
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Priority: Major
>







[jira] [Created] (SPARK-43231) Reduce the memory requirement in torch-related tests

2023-04-21 Thread Ruifeng Zheng (Jira)
Ruifeng Zheng created SPARK-43231:
-

 Summary: Reduce the memory requirement in torch-related tests
 Key: SPARK-43231
 URL: https://issues.apache.org/jira/browse/SPARK-43231
 Project: Spark
  Issue Type: Test
  Components: Connect, ML, PySpark, Tests
Affects Versions: 3.5.0
Reporter: Ruifeng Zheng









[jira] [Created] (SPARK-43232) Improve ObjectHashAggregateExec performance

2023-04-21 Thread XiDuo You (Jira)
XiDuo You created SPARK-43232:
-

 Summary: Improve ObjectHashAggregateExec performance
 Key: SPARK-43232
 URL: https://issues.apache.org/jira/browse/SPARK-43232
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.5.0
Reporter: XiDuo You


The `ObjectHashAggregateExec` has two performance issues:

- heavy overhead of Scala sugar in `createNewAggregationBuffer`

- unnecessary grouping key comparison if it falls back to the sort-based aggregator

 
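As a rough illustration of the first point (hypothetical code, not the actual `createNewAggregationBuffer`), a chained functional pipeline on a hot path allocates a closure and an intermediate collection per call, which a plain while loop avoids:

{code:java}
// Illustrative only: the "sugar" version allocates a by-name closure and an
// intermediate Seq per call; the imperative version writes straight into the array.
def newBufferWithSugar(n: Int): Array[Any] =
  Seq.fill[Any](n)(null).toArray

def newBufferImperative(n: Int): Array[Any] = {
  val buf = new Array[Any](n)
  var i = 0
  while (i < n) { buf(i) = null; i += 1 }
  buf
}
{code}
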






[jira] [Updated] (SPARK-43232) Improve ObjectHashAggregateExec performance

2023-04-21 Thread XiDuo You (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiDuo You updated SPARK-43232:
--
Description: 
The `ObjectHashAggregateExec` has two performance issues:
 - heavy overhead of Scala sugar in `createNewAggregationBuffer`

 - unnecessary grouping key comparison after falling back to the sort-based aggregator

 

  was:
The `ObjectHashAggregateExec` has two performance issues:

- heavy overhead of Scala sugar in `createNewAggregationBuffer`

- unnecessary grouping key comparison if it falls back to the sort-based aggregator

 


> Improve ObjectHashAggregateExec performance
> ---
>
> Key: SPARK-43232
> URL: https://issues.apache.org/jira/browse/SPARK-43232
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: XiDuo You
>Priority: Major
>
> The `ObjectHashAggregateExec` has two performance issues:
>  - heavy overhead of Scala sugar in `createNewAggregationBuffer`
>  - unnecessary grouping key comparison after falling back to the sort-based 
> aggregator
>  






[jira] [Updated] (SPARK-42780) Upgrade google Tink to 1.9.0

2023-04-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-42780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bjørn Jørgensen updated SPARK-42780:

Summary: Upgrade google Tink to 1.9.0  (was: Upgrade google Tink from 1.7.0 
to 1.8.0)

> Upgrade google Tink to 1.9.0
> 
>
> Key: SPARK-42780
> URL: https://issues.apache.org/jira/browse/SPARK-42780
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Bjørn Jørgensen
>Priority: Major
>
> [SNYK-JAVA-COMGOOGLEPROTOBUF-3040284|https://security.snyk.io/vuln/SNYK-JAVA-COMGOOGLEPROTOBUF-3040284]
> [SNYK-JAVA-COMGOOGLEPROTOBUF-3167772|https://security.snyk.io/vuln/SNYK-JAVA-COMGOOGLEPROTOBUF-3167772]






[jira] [Commented] (SPARK-42780) Upgrade google Tink to 1.9.0

2023-04-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-42780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714971#comment-17714971
 ] 

Bjørn Jørgensen commented on SPARK-42780:
-

https://github.com/apache/spark/pull/40878

> Upgrade google Tink to 1.9.0
> 
>
> Key: SPARK-42780
> URL: https://issues.apache.org/jira/browse/SPARK-42780
> Project: Spark
>  Issue Type: Dependency upgrade
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Bjørn Jørgensen
>Priority: Major
>
> [SNYK-JAVA-COMGOOGLEPROTOBUF-3040284|https://security.snyk.io/vuln/SNYK-JAVA-COMGOOGLEPROTOBUF-3040284]
> [SNYK-JAVA-COMGOOGLEPROTOBUF-3167772|https://security.snyk.io/vuln/SNYK-JAVA-COMGOOGLEPROTOBUF-3167772]






[jira] [Resolved] (SPARK-43142) DSL expressions fail on attribute with special characters

2023-04-21 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-43142.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40794
[https://github.com/apache/spark/pull/40794]

> DSL expressions fail on attribute with special characters
> -
>
> Key: SPARK-43142
> URL: https://issues.apache.org/jira/browse/SPARK-43142
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Willi Raschkowski
>Priority: Major
> Fix For: 3.5.0
>
>
> Expressions on implicitly converted attributes fail if the attributes have 
> names containing special characters. They fail even if the attributes are 
> backtick-quoted:
> {code:java}
> scala> import org.apache.spark.sql.catalyst.dsl.expressions._
> import org.apache.spark.sql.catalyst.dsl.expressions._
> scala> "`slashed/col`".attr
> res0: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 
> 'slashed/col
> scala> "`slashed/col`".attr.asc
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input '/' expecting {, '.', '-'}(line 1, pos 7)
> == SQL ==
> slashed/col
> ---^^^
> {code}






[jira] [Assigned] (SPARK-43142) DSL expressions fail on attribute with special characters

2023-04-21 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-43142:
---

Assignee: Willi Raschkowski

> DSL expressions fail on attribute with special characters
> -
>
> Key: SPARK-43142
> URL: https://issues.apache.org/jira/browse/SPARK-43142
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Willi Raschkowski
>Assignee: Willi Raschkowski
>Priority: Major
> Fix For: 3.5.0
>
>
> Expressions on implicitly converted attributes fail if the attributes have 
> names containing special characters. They fail even if the attributes are 
> backtick-quoted:
> {code:java}
> scala> import org.apache.spark.sql.catalyst.dsl.expressions._
> import org.apache.spark.sql.catalyst.dsl.expressions._
> scala> "`slashed/col`".attr
> res0: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 
> 'slashed/col
> scala> "`slashed/col`".attr.asc
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input '/' expecting {, '.', '-'}(line 1, pos 7)
> == SQL ==
> slashed/col
> ---^^^
> {code}






[jira] [Commented] (SPARK-42330) Assign name to _LEGACY_ERROR_TEMP_2175

2023-04-21 Thread Koray Beyaz (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714992#comment-17714992
 ] 

Koray Beyaz commented on SPARK-42330:
-

Working on this issue

> Assign name to _LEGACY_ERROR_TEMP_2175
> --
>
> Key: SPARK-42330
> URL: https://issues.apache.org/jira/browse/SPARK-42330
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Haejoon Lee
>Priority: Major
>







[jira] [Commented] (SPARK-43196) Replace reflection w/ direct calling for `ContainerLaunchContext#setTokensConf`

2023-04-21 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715030#comment-17715030
 ] 

Ignite TC Bot commented on SPARK-43196:
---

User 'pan3793' has created a pull request for this issue:
https://github.com/apache/spark/pull/40900

> Replace reflection w/ direct calling for 
> `ContainerLaunchContext#setTokensConf`
> ---
>
> Key: SPARK-43196
> URL: https://issues.apache.org/jira/browse/SPARK-43196
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.5.0
>
>







[jira] [Reopened] (SPARK-43142) DSL expressions fail on attribute with special characters

2023-04-21 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang reopened SPARK-43142:
-

> DSL expressions fail on attribute with special characters
> -
>
> Key: SPARK-43142
> URL: https://issues.apache.org/jira/browse/SPARK-43142
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Willi Raschkowski
>Assignee: Willi Raschkowski
>Priority: Major
> Fix For: 3.5.0
>
>
> Expressions on implicitly converted attributes fail if the attributes have 
> names containing special characters. They fail even if the attributes are 
> backtick-quoted:
> {code:java}
> scala> import org.apache.spark.sql.catalyst.dsl.expressions._
> import org.apache.spark.sql.catalyst.dsl.expressions._
> scala> "`slashed/col`".attr
> res0: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 
> 'slashed/col
> scala> "`slashed/col`".attr.asc
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input '/' expecting {, '.', '-'}(line 1, pos 7)
> == SQL ==
> slashed/col
> ---^^^
> {code}






[jira] [Updated] (SPARK-43142) DSL expressions fail on attribute with special characters

2023-04-21 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-43142:

Fix Version/s: (was: 3.5.0)

> DSL expressions fail on attribute with special characters
> -
>
> Key: SPARK-43142
> URL: https://issues.apache.org/jira/browse/SPARK-43142
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Willi Raschkowski
>Assignee: Willi Raschkowski
>Priority: Major
>
> Expressions on implicitly converted attributes fail if the attributes have 
> names containing special characters. They fail even if the attributes are 
> backtick-quoted:
> {code:java}
> scala> import org.apache.spark.sql.catalyst.dsl.expressions._
> import org.apache.spark.sql.catalyst.dsl.expressions._
> scala> "`slashed/col`".attr
> res0: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 
> 'slashed/col
> scala> "`slashed/col`".attr.asc
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input '/' expecting {, '.', '-'}(line 1, pos 7)
> == SQL ==
> slashed/col
> ---^^^
> {code}






[jira] [Resolved] (SPARK-43179) Add option for applications to control saving of metadata in the External Shuffle Service LevelDB

2023-04-21 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan resolved SPARK-43179.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40843
[https://github.com/apache/spark/pull/40843]

> Add option for applications to control saving of metadata in the External 
> Shuffle Service LevelDB
> -
>
> Key: SPARK-43179
> URL: https://issues.apache.org/jira/browse/SPARK-43179
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 3.4.0
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.5.0
>
>
> Currently, the External Shuffle Service stores application metadata in 
> LevelDB. This is necessary to enable the shuffle server to resume serving 
> shuffle data for an application whose executors registered before the 
> NodeManager restarts. However, the metadata includes the application secret, 
> which is stored in LevelDB without encryption. This is a potential security 
> risk, particularly for applications with high security requirements. While 
> filesystem access control lists (ACLs) can help protect keys and 
> certificates, they may not be sufficient for some use cases. In response, we 
> have decided not to store metadata for these high-security applications in 
> LevelDB. As a result, these applications may experience more failures in the 
> event of a node restart, but we believe this trade-off is acceptable given 
> the increased security risk.






[jira] [Assigned] (SPARK-43179) Add option for applications to control saving of metadata in the External Shuffle Service LevelDB

2023-04-21 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan reassigned SPARK-43179:
---

Assignee: Chandni Singh

> Add option for applications to control saving of metadata in the External 
> Shuffle Service LevelDB
> -
>
> Key: SPARK-43179
> URL: https://issues.apache.org/jira/browse/SPARK-43179
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 3.4.0
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>
> Currently, the External Shuffle Service stores application metadata in 
> LevelDB. This is necessary to enable the shuffle server to resume serving 
> shuffle data for an application whose executors registered before the 
> NodeManager restarts. However, the metadata includes the application secret, 
> which is stored in LevelDB without encryption. This is a potential security 
> risk, particularly for applications with high security requirements. While 
> filesystem access control lists (ACLs) can help protect keys and 
> certificates, they may not be sufficient for some use cases. In response, we 
> have decided not to store metadata for these high-security applications in 
> LevelDB. As a result, these applications may experience more failures in the 
> event of a node restart, but we believe this trade-off is acceptable given 
> the increased security risk.






[jira] [Commented] (SPARK-43134) Add streaming query exception API in Scala

2023-04-21 Thread Wei Liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715148#comment-17715148
 ] 

Wei Liu commented on SPARK-43134:
-

I'm working on this

 

> Add streaming query exception API in Scala
> --
>
> Key: SPARK-43134
> URL: https://issues.apache.org/jira/browse/SPARK-43134
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Raghu Angadi
>Priority: Major
>







[jira] [Commented] (SPARK-43032) Add StreamingQueryManager API

2023-04-21 Thread Wei Liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715147#comment-17715147
 ] 

Wei Liu commented on SPARK-43032:
-

[https://github.com/apache/spark/pull/40861] (still a draft)

 

> Add StreamingQueryManager API
> -
>
> Key: SPARK-43032
> URL: https://issues.apache.org/jira/browse/SPARK-43032
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Raghu Angadi
>Priority: Major
>
> Add the StreamingQueryManager API. It would include the APIs that can be 
> directly supported; APIs like registering a streaming listener will be 
> handled separately. 
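For reference, the existing (non-Connect) StreamingQueryManager surface this task would expose; usage below is illustrative, and listener registration is out of scope per the description.

{code:java}
import org.apache.spark.sql.SparkSession

// Reference sketch of the existing StreamingQueryManager API (illustrative usage,
// not the Connect implementation).
val spark = SparkSession.builder().getOrCreate()
val sqm = spark.streams                               // StreamingQueryManager
val active = sqm.active                               // currently active streaming queries
val byId = active.headOption.map(q => sqm.get(q.id))  // look up a query by its id
// sqm.awaitAnyTermination()                          // block until any active query terminates
{code}
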






[jira] [Commented] (SPARK-43143) Scala: Add StreamingQuery awaitTermination() API

2023-04-21 Thread Wei Liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715149#comment-17715149
 ] 

Wei Liu commented on SPARK-43143:
-

I'm working on this

 

> Scala: Add StreamingQuery awaitTermination() API
> 
>
> Key: SPARK-43143
> URL: https://issues.apache.org/jira/browse/SPARK-43143
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Raghu Angadi
>Priority: Major
>







[jira] [Commented] (SPARK-43206) Streaming query exception() also include stack trace

2023-04-21 Thread Wei Liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715171#comment-17715171
 ] 

Wei Liu commented on SPARK-43206:
-

I'll work on this.

To myself: don't forget JVM exceptions

 

> Streaming query exception() also include stack trace
> 
>
> Key: SPARK-43206
> URL: https://issues.apache.org/jira/browse/SPARK-43206
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Wei Liu
>Priority: Major
>
> [https://github.com/apache/spark/pull/40785#issuecomment-1515522281]
>  
>  






[jira] [Created] (SPARK-43233) Before batch reading from Kafka, log topic partition, offset range, etc., for debugging

2023-04-21 Thread Siying Dong (Jira)
Siying Dong created SPARK-43233:
---

 Summary: Before batch reading from Kafka, log topic partition, 
offset range, etc., for debugging
 Key: SPARK-43233
 URL: https://issues.apache.org/jira/browse/SPARK-43233
 Project: Spark
  Issue Type: Improvement
  Components: Structured Streaming
Affects Versions: 3.4.0
Reporter: Siying Dong


When debugging a slowness issue in Structured Streaming, it is hard to map a 
Kafka topic and partition to the Spark task reading it. Adding some logging in 
the executor would make this easier.
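A minimal sketch of the kind of log line proposed (hypothetical names; the actual change would live in the Kafka connector's partition reader):

{code:java}
import org.slf4j.LoggerFactory

// Illustrative only: emit topic, partition, and offset range before reading a batch,
// so a slow task can be mapped back to the Kafka partition it is consuming.
object KafkaBatchReadLogging {
  private val log = LoggerFactory.getLogger(getClass)

  def logBeforeRead(topic: String, partition: Int, fromOffset: Long, untilOffset: Long): Unit =
    log.info(s"Reading Kafka topic=$topic partition=$partition offsets=[$fromOffset, $untilOffset)")
}
{code}
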






[jira] [Updated] (SPARK-43233) Before batch reading from Kafka, log topic partition, offset range, etc., for debugging

2023-04-21 Thread Siying Dong (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated SPARK-43233:

Issue Type: Task  (was: Improvement)

> Before batch reading from Kafka, log topic partition, offset range, etc., for 
> debugging
> -
>
> Key: SPARK-43233
> URL: https://issues.apache.org/jira/browse/SPARK-43233
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming
>Affects Versions: 3.4.0
>Reporter: Siying Dong
>Priority: Trivial
>
> When debugging a slowness issue in Structured Streaming, it is hard to map a 
> Kafka topic and partition to the Spark task reading it. Adding some logging 
> in the executor would make this easier.






[jira] [Commented] (SPARK-43206) Connect Better StreamingQueryException

2023-04-21 Thread Wei Liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715178#comment-17715178
 ] 

Wei Liu commented on SPARK-43206:
-

Also include cause, offsets, stack trace...

 

> Connect Better StreamingQueryException
> --
>
> Key: SPARK-43206
> URL: https://issues.apache.org/jira/browse/SPARK-43206
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Wei Liu
>Priority: Major
>
> [https://github.com/apache/spark/pull/40785#issuecomment-1515522281]
>  
>  






[jira] [Updated] (SPARK-43206) Connect Better StreamingQueryException

2023-04-21 Thread Wei Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Liu updated SPARK-43206:

Summary: Connect Better StreamingQueryException  (was: Streaming query 
exception() also include stack trace)

> Connect Better StreamingQueryException
> --
>
> Key: SPARK-43206
> URL: https://issues.apache.org/jira/browse/SPARK-43206
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Wei Liu
>Priority: Major
>
> [https://github.com/apache/spark/pull/40785#issuecomment-1515522281]
>  
>  






[jira] [Commented] (SPARK-38114) Spark build fails in Windows

2023-04-21 Thread Felipe (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715180#comment-17715180
 ] 

Felipe commented on SPARK-38114:


Hi, this seems like a big issue. Has anybody found a workaround?

> Spark build fails in Windows
> 
>
> Key: SPARK-38114
> URL: https://issues.apache.org/jira/browse/SPARK-38114
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: SOUVIK PAUL
>Priority: Major
>
> java.lang.NoSuchMethodError: 
> org.fusesource.jansi.AnsiConsole.wrapOutputStream(Ljava/io/OutputStream;)Ljava/io/OutputStream;
> jline.AnsiWindowsTerminal.detectAnsiSupport(AnsiWindowsTerminal.java:57)
> jline.AnsiWindowsTerminal.<init>(AnsiWindowsTerminal.java:27)
>  
> A similar issue is being faced by the Quarkus project with the latest Maven. 
> [https://github.com/quarkusio/quarkus/issues/19491]
>  
> Upgrading the scala-maven-plugin seems to resolve the issue, but this ticket 
> may be a blocker:
> https://issues.apache.org/jira/browse/SPARK-36547






[jira] [Resolved] (SPARK-43174) Fix SparkSQLCLIDriver completer

2023-04-21 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang resolved SPARK-43174.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40838
[https://github.com/apache/spark/pull/40838]

> Fix SparkSQLCLIDriver completer
> ---
>
> Key: SPARK-43174
> URL: https://issues.apache.org/jira/browse/SPARK-43174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Assigned] (SPARK-43174) Fix SparkSQLCLIDriver completer

2023-04-21 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang reassigned SPARK-43174:
---

Assignee: Yuming Wang

> Fix SparkSQLCLIDriver completer
> ---
>
> Key: SPARK-43174
> URL: https://issues.apache.org/jira/browse/SPARK-43174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>







[jira] [Resolved] (SPARK-43046) Implement dropDuplicatesWithinWatermark in Spark Connect

2023-04-21 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-43046.
--
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40834
[https://github.com/apache/spark/pull/40834]

> Implement dropDuplicatesWithinWatermark in Spark Connect
> 
>
> Key: SPARK-43046
> URL: https://issues.apache.org/jira/browse/SPARK-43046
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Jungtaek Lim
>Priority: Major
> Fix For: 3.5.0
>
>
> Once SPARK-42931 has been merged, we will need to add the 
> dropDuplicatesWithinWatermark API to Spark Connect, for both Python and 
> Scala/Java.
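A minimal usage sketch of the API to be exposed (source and column names are illustrative):

{code:java}
import org.apache.spark.sql.SparkSession

// Illustrative only: deduplicate by id within the watermark delay. The task is to make
// this same call work over Spark Connect, in both Python and Scala/Java.
val spark = SparkSession.builder().getOrCreate()
val events = spark.readStream.format("rate").load()
  .withColumnRenamed("timestamp", "eventTime")
  .withColumnRenamed("value", "id")

val deduped = events
  .withWatermark("eventTime", "10 minutes")
  .dropDuplicatesWithinWatermark("id")
{code}
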






[jira] [Resolved] (SPARK-43082) Arrow-optimized Python UDFs in Spark Connect

2023-04-21 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-43082.
--
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40725
[https://github.com/apache/spark/pull/40725]

> Arrow-optimized Python UDFs in Spark Connect
> 
>
> Key: SPARK-43082
> URL: https://issues.apache.org/jira/browse/SPARK-43082
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.5.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
> Fix For: 3.5.0
>
>
> Implement Arrow-optimized Python UDFs in Spark Connect.






[jira] [Assigned] (SPARK-43082) Arrow-optimized Python UDFs in Spark Connect

2023-04-21 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-43082:


Assignee: Xinrong Meng

> Arrow-optimized Python UDFs in Spark Connect
> 
>
> Key: SPARK-43082
> URL: https://issues.apache.org/jira/browse/SPARK-43082
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.5.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
>
> Implement Arrow-optimized Python UDFs in Spark Connect.






[jira] [Created] (SPARK-43234) Migrate ValueError from Connect DataFrame into error class

2023-04-21 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-43234:
---

 Summary: Migrate ValueError from Connect DataFrame into error class
 Key: SPARK-43234
 URL: https://issues.apache.org/jira/browse/SPARK-43234
 Project: Spark
  Issue Type: Sub-task
  Components: Connect, PySpark
Affects Versions: 3.5.0
Reporter: Haejoon Lee


Migrate ValueError from Connect DataFrame into error class


