[jira] [Updated] (SPARK-39404) Unable to query _metadata in streaming if getBatch returns multiple logical nodes in the DataFrame

2022-10-22 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-39404: - Fix Version/s: 3.3.2 > Unable to query _metadata in streaming if getBatch returns multiple logic

[jira] [Resolved] (SPARK-40657) Add support for compiled classes (Java classes)

2022-10-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40657. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38286 [https://gi

[jira] [Assigned] (SPARK-40657) Add support for compiled classes (Java classes)

2022-10-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40657: Assignee: Raghu Angadi > Add support for compiled classes (Java classes) > --

[jira] [Resolved] (SPARK-39590) Python API Parity in Structure Streaming

2022-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39590. -- Resolution: Duplicate Closed as a duplicate. > Python API Parity in Structure Streaming >

[jira] [Updated] (SPARK-40025) Project Lightspeed: Faster and Simpler Stream Processing with Apache Spark

2022-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40025: - Description: Project Lightspeed is an umbrella project aimed at improving a couple of key aspec

[jira] [Updated] (SPARK-40844) Flip the default value of Kafka offset fetching config

2022-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40844: - Labels: release-notes (was: ) > Flip the default value of Kafka offset fetching config > --

[jira] [Assigned] (SPARK-40844) Flip the default value of Kafka offset fetching config

2022-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40844: Assignee: Jungtaek Lim > Flip the default value of Kafka offset fetching config > ---

[jira] [Resolved] (SPARK-40844) Flip the default value of Kafka offset fetching config

2022-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40844. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38306 [https://gi

[jira] [Created] (SPARK-40844) Flip the default value of Kafka offset fetching config

2022-10-18 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40844: Summary: Flip the default value of Kafka offset fetching config Key: SPARK-40844 URL: https://issues.apache.org/jira/browse/SPARK-40844 Project: Spark Issue

[jira] [Resolved] (SPARK-40670) NPE in applyInPandasWithState when the input schema has "non-nullable" column(s)

2022-10-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40670. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38115 [https://gi

[jira] [Assigned] (SPARK-40670) NPE in applyInPandasWithState when the input schema has "non-nullable" column(s)

2022-10-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40670: Assignee: Jungtaek Lim > NPE in applyInPandasWithState when the input schema has "non-nul

[jira] [Created] (SPARK-40670) NPE in applyInPandasWithState when the input schema has "non-nullable" column(s)

2022-10-05 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40670: Summary: NPE in applyInPandasWithState when the input schema has "non-nullable" column(s) Key: SPARK-40670 URL: https://issues.apache.org/jira/browse/SPARK-40670 Proj

[jira] [Commented] (SPARK-40670) NPE in applyInPandasWithState when the input schema has "non-nullable" column(s)

2022-10-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613257#comment-17613257 ] Jungtaek Lim commented on SPARK-40670: -- Will submit a PR soon. > NPE in applyInPan

[jira] [Resolved] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40495. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37936 [https://gi

[jira] [Assigned] (SPARK-40495) Add additional tests to StreamingSessionWindowSuite

2022-09-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40495: Assignee: Wei Liu > Add additional tests to StreamingSessionWindowSuite > ---

[jira] [Assigned] (SPARK-40509) Construct an example of applyInPandasWithState in examples directory

2022-09-28 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40509: Assignee: Chaoqin Li > Construct an example of applyInPandasWithState in examples directo

[jira] [Resolved] (SPARK-40509) Construct an example of applyInPandasWithState in examples directory

2022-09-28 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40509. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38013 [https://gi

[jira] [Resolved] (SPARK-40571) Construct a test case to verify fault-tolerance semantic with random python worker failures

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40571. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38008 [https://gi

[jira] [Assigned] (SPARK-40571) Construct a test case to verify fault-tolerance semantic with random python worker failures

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40571: Assignee: Jungtaek Lim > Construct a test case to verify fault-tolerance semantic with ra

[jira] [Created] (SPARK-40581) Improving testability of GroupState in applyInPandasWithState

2022-09-27 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40581: Summary: Improving testability of GroupState in applyInPandasWithState Key: SPARK-40581 URL: https://issues.apache.org/jira/browse/SPARK-40581 Project: Spark

[jira] [Resolved] (SPARK-35800) Improving testability of GroupState in streaming flatMapGroupsWithState

2022-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-35800. -- Fix Version/s: 3.2.0 Resolution: Fixed Looks like Li Zhang didn't create an ASF Jira acc

[jira] [Created] (SPARK-40571) Construct a test case to verify fault-tolerance semantic with random python worker failures

2022-09-26 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40571: Summary: Construct a test case to verify fault-tolerance semantic with random python worker failures Key: SPARK-40571 URL: https://issues.apache.org/jira/browse/SPARK-40571

[jira] [Assigned] (SPARK-40492) Perform maintenance of StateStore instances when they become inactive

2022-09-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40492: Assignee: Chaoqin Li > Perform maintenance of StateStore instances when they become inact

[jira] [Resolved] (SPARK-40492) Perform maintenance of StateStore instances when they become inactive

2022-09-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40492. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37935 [https://gi

[jira] [Commented] (SPARK-40437) Support string representation of durationMs in GroupState.setTimeoutDuration

2022-09-22 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608179#comment-17608179 ] Jungtaek Lim commented on SPARK-40437: -- It doesn't seem to be an easy one to solve...

[jira] [Commented] (SPARK-40438) Support additionalDuration parameter in GroupState.setTimeoutTimestamp

2022-09-22 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608180#comment-17608180 ] Jungtaek Lim commented on SPARK-40438: -- Same comment as SPARK-40437 {quote} It d

[jira] [Resolved] (SPARK-40435) Add test suites for applyInPandasWithState in PySpark

2022-09-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40435. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37894 [https://gi

[jira] [Assigned] (SPARK-40435) Add test suites for applyInPandasWithState in PySpark

2022-09-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40435: Assignee: Jungtaek Lim > Add test suites for applyInPandasWithState in PySpark >

[jira] [Created] (SPARK-40509) Construct an example of applyInPandasWithState in examples directory

2022-09-20 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40509: Summary: Construct an example of applyInPandasWithState in examples directory Key: SPARK-40509 URL: https://issues.apache.org/jira/browse/SPARK-40509 Project: Spark

[jira] [Commented] (SPARK-40489) Spark 3.3.0 breaks with SFL4J 2.

2022-09-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606875#comment-17606875 ] Jungtaek Lim commented on SPARK-40489: -- It would be nice if you can help Spark to r

[jira] [Updated] (SPARK-40489) Spark 3.3.0 breaks with SFL4J 2.

2022-09-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40489: - Priority: Major (was: Critical) > Spark 3.3.0 breaks with SFL4J 2. > --

[jira] [Commented] (SPARK-40489) Spark 3.3.0 breaks with SFL4J 2.

2022-09-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606872#comment-17606872 ] Jungtaek Lim commented on SPARK-40489: -- https://www.slf4j.org/news.html 2022-08-20

[jira] [Assigned] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40466: Assignee: Huanli Wang > Improve the error message if the DSv2 source is disabled but DSv1

[jira] [Resolved] (SPARK-40466) Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available

2022-09-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40466. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37917 [https://gi

[jira] [Commented] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606433#comment-17606433 ] Jungtaek Lim commented on SPARK-40460: -- [~yaohua] Just to clarify, streaming metada

[jira] [Resolved] (SPARK-40460) Streaming metrics is zero when select _metadata

2022-09-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40460. -- Fix Version/s: 3.4.0 Assignee: Yaohua Zhao Resolution: Fixed Issue resolved vi

[jira] [Resolved] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40467. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37907 [https://gi

[jira] [Assigned] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40467: Assignee: Jungtaek Lim > Split FlatMapGroupsWithState down to multiple test suites >

[jira] [Updated] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40467: - Priority: Minor (was: Major) > Split FlatMapGroupsWithState down to multiple test suites >

[jira] [Created] (SPARK-40467) Split FlatMapGroupsWithState down to multiple test suites

2022-09-15 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40467: Summary: Split FlatMapGroupsWithState down to multiple test suites Key: SPARK-40467 URL: https://issues.apache.org/jira/browse/SPARK-40467 Project: Spark Iss

[jira] [Assigned] (SPARK-40432) Introduce GroupStateImpl and GroupStateTimeout in PySpark

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40432: Assignee: Jungtaek Lim > Introduce GroupStateImpl and GroupStateTimeout in PySpark >

[jira] [Resolved] (SPARK-40432) Introduce GroupStateImpl and GroupStateTimeout in PySpark

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40432. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37889 [https://gi

[jira] [Assigned] (SPARK-40433) Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40433: Assignee: Jungtaek Lim > Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to

[jira] [Resolved] (SPARK-40433) Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40433. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37891 [https://gi

[jira] [Created] (SPARK-40444) Support initial state in applyInPandasWithState

2022-09-15 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40444: Summary: Support initial state in applyInPandasWithState Key: SPARK-40444 URL: https://issues.apache.org/jira/browse/SPARK-40444 Project: Spark Issue Type: S

[jira] [Created] (SPARK-40443) Support applyInPandasWithState in batch query

2022-09-15 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40443: Summary: Support applyInPandasWithState in batch query Key: SPARK-40443 URL: https://issues.apache.org/jira/browse/SPARK-40443 Project: Spark Issue Type: Sub

[jira] [Updated] (SPARK-40438) Support additionalDuration parameter in GroupState.setTimeoutTimestamp

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40438: - Parent: SPARK-40431 Issue Type: Sub-task (was: Improvement) > Support additionalDuratio

[jira] [Updated] (SPARK-40437) Support string representation of durationMs in GroupState.setTimeoutDuration

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-40437: - Parent: SPARK-40431 Issue Type: Sub-task (was: Improvement) > Support string representa

[jira] [Commented] (SPARK-40438) Support additionalDuration parameter in GroupState.setTimeoutTimestamp

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605161#comment-17605161 ] Jungtaek Lim commented on SPARK-40438: -- Let's add this SPARK-40431 for better trace

[jira] [Commented] (SPARK-40437) Support string representation of durationMs in GroupState.setTimeoutDuration

2022-09-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605162#comment-17605162 ] Jungtaek Lim commented on SPARK-40437: -- Let's add this SPARK-40431 for better trace

[jira] [Created] (SPARK-40435) Add test suites for applyInPandasWithState in PySpark

2022-09-14 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40435: Summary: Add test suites for applyInPandasWithState in PySpark Key: SPARK-40435 URL: https://issues.apache.org/jira/browse/SPARK-40435 Project: Spark Issue T

[jira] [Created] (SPARK-40434) Implement applyInPandasWithState in PySpark

2022-09-14 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40434: Summary: Implement applyInPandasWithState in PySpark Key: SPARK-40434 URL: https://issues.apache.org/jira/browse/SPARK-40434 Project: Spark Issue Type: Sub-t

[jira] [Created] (SPARK-40433) Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row

2022-09-14 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40433: Summary: Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row Key: SPARK-40433 URL: https://issues.apache.org/jira/browse/SPARK-40433 Project: Spa

[jira] [Created] (SPARK-40432) Introduce GroupStateImpl and GroupStateTimeout in PySpark

2022-09-14 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40432: Summary: Introduce GroupStateImpl and GroupStateTimeout in PySpark Key: SPARK-40432 URL: https://issues.apache.org/jira/browse/SPARK-40432 Project: Spark Iss

[jira] [Commented] (SPARK-40431) Introduce "Arbitrary Stateful Processing" in Structured Streaming with Python

2022-09-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17605063#comment-17605063 ] Jungtaek Lim commented on SPARK-40431: -- This is a joint effort between me and [~hyukji

[jira] [Created] (SPARK-40431) Introduce "Arbitrary Stateful Processing" in Structured Streaming with Python

2022-09-14 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40431: Summary: Introduce "Arbitrary Stateful Processing" in Structured Streaming with Python Key: SPARK-40431 URL: https://issues.apache.org/jira/browse/SPARK-40431 Project

[jira] [Resolved] (SPARK-40414) Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data

2022-09-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40414. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37864 [https://gi

[jira] [Assigned] (SPARK-40414) Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data

2022-09-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-40414: Assignee: Jungtaek Lim > Fix PythonArrowInput and PythonArrowOutput to be more generic to

[jira] [Commented] (SPARK-40414) Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data

2022-09-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603559#comment-17603559 ] Jungtaek Lim commented on SPARK-40414: -- Will submit a PR soon. > Fix PythonArrow

[jira] [Created] (SPARK-40414) Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data

2022-09-13 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-40414: Summary: Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data Key: SPARK-40414 URL: https://issues.apache.org/jira/browse/SPARK-40414

[jira] [Resolved] (SPARK-40039) Introducing a streaming checkpoint file manager based on Hadoop's Abortable interface

2022-08-24 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-40039. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37474 [https://gi

[jira] [Commented] (SPARK-40170) StringCoding UTF8 decode slowly

2022-08-22 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582777#comment-17582777 ] Jungtaek Lim commented on SPARK-40170: -- Premature optimization is the root of all e

[jira] [Commented] (SPARK-40039) Introducing a streaming checkpoint file manager based on Hadoop's Abortable interface

2022-08-10 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578195#comment-17578195 ] Jungtaek Lim commented on SPARK-40039: -- Very interesting one to see! (Disclaimer: A

[jira] [Assigned] (SPARK-39940) Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink

2022-08-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39940: Assignee: Jungtaek Lim > Batch query cannot read the updates from streaming query if stre

[jira] [Resolved] (SPARK-39940) Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink

2022-08-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39940. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37368 [https://gi

[jira] [Created] (SPARK-39949) Principals in KafkaTestUtils should use canonical host name

2022-08-02 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39949: Summary: Principals in KafkaTestUtils should use canonical host name Key: SPARK-39949 URL: https://issues.apache.org/jira/browse/SPARK-39949 Project: Spark

[jira] [Commented] (SPARK-39949) Principals in KafkaTestUtils should use canonical host name

2022-08-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574105#comment-17574105 ] Jungtaek Lim commented on SPARK-39949: -- Will submit a fix shortly. > Principals in

[jira] [Commented] (SPARK-39940) Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink

2022-08-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573997#comment-17573997 ] Jungtaek Lim commented on SPARK-39940: -- Will submit a fix shortly. > Batch query c

[jira] [Created] (SPARK-39940) Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink

2022-08-01 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39940: Summary: Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink Key: SPARK-39940 URL: https://issues.apache.org/jira

[jira] [Resolved] (SPARK-39839) Handle special case of null variable-length Decimal with non-zero offsetAndSize in UnsafeRow structural integrity check

2022-07-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39839. -- Fix Version/s: 3.3.1 3.2.3 3.4.0 Resolution: Fixed

[jira] [Assigned] (SPARK-39839) Handle special case of null variable-length Decimal with non-zero offsetAndSize in UnsafeRow structural integrity check

2022-07-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39839: Assignee: Kris Mok > Handle special case of null variable-length Decimal with non-zero >

[jira] [Resolved] (SPARK-39834) Include the origin stats and constraints for LogicalRDD if it comes from DataFrame

2022-07-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39834. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37248 [https://gi

[jira] [Assigned] (SPARK-39834) Include the origin stats and constraints for LogicalRDD if it comes from DataFrame

2022-07-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39834: Assignee: Jungtaek Lim > Include the origin stats and constraints for LogicalRDD if it co

[jira] [Updated] (SPARK-39847) Race condition related to interruption of task threads while they are in RocksDBLoader.loadLibrary()

2022-07-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-39847: - Description: One of our workloads experienced a rare failure in `RocksDBLoader` {code:java} Caus

[jira] [Created] (SPARK-39834) Include the origin stats and constraints for LogicalRDD if it comes from DataFrame

2022-07-21 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39834: Summary: Include the origin stats and constraints for LogicalRDD if it comes from DataFrame Key: SPARK-39834 URL: https://issues.apache.org/jira/browse/SPARK-39834 Pr

[jira] [Assigned] (SPARK-39805) Deprecate Trigger.Once and Promote Trigger.AvailableNow

2022-07-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39805: Assignee: Jungtaek Lim > Deprecate Trigger.Once and Promote Trigger.AvailableNow > --

[jira] [Resolved] (SPARK-39805) Deprecate Trigger.Once and Promote Trigger.AvailableNow

2022-07-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39805. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37213 [https://gi

[jira] [Created] (SPARK-39805) Deprecate Trigger.Once and Promote Trigger.AvailableNow

2022-07-17 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39805: Summary: Deprecate Trigger.Once and Promote Trigger.AvailableNow Key: SPARK-39805 URL: https://issues.apache.org/jira/browse/SPARK-39805 Project: Spark Issue

[jira] [Assigned] (SPARK-39781) Add support for configuring max_open_files through RocksDB state store provider

2022-07-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39781: Assignee: Anish Shrigondekar > Add support for configuring max_open_files through RocksDB

[jira] [Resolved] (SPARK-39781) Add support for configuring max_open_files through RocksDB state store provider

2022-07-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39781. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37196 [https://gi

[jira] [Updated] (SPARK-39622) ParquetIOSuite fails intermittently on master branch

2022-07-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-39622: - Description: "SPARK-7837 Do not close output writer twice when commitTask() fails" in ParquetIO

[jira] [Commented] (SPARK-39622) ParquetIOSuite fails intermittently on master branch

2022-07-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566816#comment-17566816 ] Jungtaek Lim commented on SPARK-39622: -- Another failure: https://github.com/HeartSa

[jira] [Updated] (SPARK-39622) ParquetIOSuite fails intermittently on master branch

2022-07-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-39622: - Summary: ParquetIOSuite fails intermittently on master branch (was: ParquetIOSuite fails consis

[jira] [Assigned] (SPARK-39748) Include the origin logical plan for LogicalRDD if it comes from DataFrame

2022-07-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39748: Assignee: Jungtaek Lim > Include the origin logical plan for LogicalRDD if it comes from

[jira] [Resolved] (SPARK-39748) Include the origin logical plan for LogicalRDD if it comes from DataFrame

2022-07-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39748. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37161 [https://gi

[jira] [Created] (SPARK-39748) Include the origin logical plan for LogicalRDD if it comes from DataFrame

2022-07-11 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39748: Summary: Include the origin logical plan for LogicalRDD if it comes from DataFrame Key: SPARK-39748 URL: https://issues.apache.org/jira/browse/SPARK-39748 Project: Sp

[jira] [Comment Edited] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562358#comment-17562358 ] Jungtaek Lim edited comment on SPARK-39602 at 7/5/22 4:14 AM:

[jira] [Commented] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562358#comment-17562358 ] Jungtaek Lim commented on SPARK-39602: -- Why not make the number of partitions be co

[jira] [Resolved] (SPARK-39650) Streaming Deduplication should not check the schema of "value"

2022-07-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39650. -- Fix Version/s: 3.3.1 3.2.2 3.4.0 Resolution: Fixed

[jira] [Assigned] (SPARK-39650) Streaming Deduplication should not check the schema of "value"

2022-07-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39650: Assignee: Jungtaek Lim > Streaming Deduplication should not check the schema of "value" >

[jira] [Updated] (SPARK-39650) Streaming Deduplication should not check the schema of "value"

2022-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-39650: - Affects Version/s: 3.2.1 3.1.2 > Streaming Deduplication should not check

[jira] [Commented] (SPARK-39650) Streaming Deduplication should not check the schema of "value"

2022-06-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561262#comment-17561262 ] Jungtaek Lim commented on SPARK-39650: -- Will post a PR sooner. > Streaming Dedupli

[jira] [Created] (SPARK-39650) Streaming Deduplication should not check the schema of "value"

2022-06-30 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39650: Summary: Streaming Deduplication should not check the schema of "value" Key: SPARK-39650 URL: https://issues.apache.org/jira/browse/SPARK-39650 Project: Spark

[jira] [Resolved] (SPARK-39564) Expose the information of catalog table to the logical plan in streaming query

2022-06-27 Thread Jungtaek Lim (Jira)
Jungtaek Lim resolved

[jira] [Assigned] (SPARK-39564) Expose the information of catalog table to the logical plan in streaming query

2022-06-27 Thread Jungtaek Lim (Jira)
Jungtaek Lim assigned

[jira] [Created] (SPARK-39622) ParquetIOSuite fails consistently on master branch

2022-06-27 Thread Jungtaek Lim (Jira)
Jungtaek Lim created

[jira] [Commented] (SPARK-39564) Expose the information of catalog table to the logical plan in streaming query

2022-06-22 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557770#comment-17557770 ] Jungtaek Lim commented on SPARK-39564: -- Will submit a PR soon. > Expose the inform

[jira] [Created] (SPARK-39564) Expose the information of catalog table to the logical plan in streaming query

2022-06-22 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-39564: Summary: Expose the information of catalog table to the logical plan in streaming query Key: SPARK-39564 URL: https://issues.apache.org/jira/browse/SPARK-39564 Projec

[jira] [Resolved] (SPARK-39404) Unable to query _metadata in streaming if getBatch returns multiple logical nodes in the DataFrame

2022-06-08 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-39404. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36801 [https://gi

[jira] [Assigned] (SPARK-39404) Unable to query _metadata in streaming if getBatch returns multiple logical nodes in the DataFrame

2022-06-08 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-39404: Assignee: Yaohua Zhao > Unable to query _metadata in streaming if getBatch returns multip
