[jira] [Updated] (SPARK-28870) Snapshot event log files to support incremental reading

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28870: - Description: This issue tracks the effort on snapshotting the current status of

[jira] [Commented] (SPARK-28870) Snapshot event log files to support incremental reading

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977128#comment-16977128 ] Jungtaek Lim commented on SPARK-28870: -- Given we have SPARK-29779, this issue is orthogonal to

[jira] [Updated] (SPARK-28870) Snapshot event log files to support incremental reading

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28870: - Parent: (was: SPARK-28594) Issue Type: New Feature (was: Sub-task) > Snapshot

[jira] [Resolved] (SPARK-29579) Guarantee compatibility of snapshot (live entities, KVstore entities)

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29579. -- Resolution: Invalid SPARK-29779 invalidates the need for this. Closing. > Guarantee

[jira] [Commented] (SPARK-29953) File stream source cleanup options may break a file sink output

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977111#comment-16977111 ] Jungtaek Lim commented on SPARK-29953: -- [~zsxwing] Just raised the PR

[jira] [Commented] (SPARK-29953) File stream source cleanup options may break a file sink output

2019-11-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977019#comment-16977019 ] Jungtaek Lim commented on SPARK-29953: -- Thanks for pinging me, [~zsxwing]. Totally makes sense.

[jira] [Commented] (SPARK-29929) Allow V2 Datasources to require a data distribution

2019-11-17 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976294#comment-16976294 ] Jungtaek Lim commented on SPARK-29929: -- Possibly a duplicate of SPARK-23889, though no one is

[jira] [Resolved] (SPARK-29581) Enable cleanup old event log files

2019-11-17 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29581. -- Resolution: Invalid We took a different approach: see SPARK-29779 > Enable cleanup old

[jira] [Updated] (SPARK-29779) Compact old event log files and clean up

2019-11-06 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29779: - Parent: SPARK-28594 Issue Type: Sub-task (was: Task) > Compact old event log files and

[jira] [Created] (SPARK-29779) Compact old event log files and clean up

2019-11-06 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29779: Summary: Compact old event log files and clean up Key: SPARK-29779 URL: https://issues.apache.org/jira/browse/SPARK-29779 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-29755) ClassCastException occurs when reading events from SHS

2019-11-05 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967354#comment-16967354 ] Jungtaek Lim commented on SPARK-29755: -- Working on the patch. > ClassCastException occurs when

[jira] [Created] (SPARK-29755) ClassCastException occurs when reading events from SHS

2019-11-05 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29755: Summary: ClassCastException occurs when reading events from SHS Key: SPARK-29755 URL: https://issues.apache.org/jira/browse/SPARK-29755 Project: Spark Issue

[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set

2019-10-31 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963737#comment-16963737 ] Jungtaek Lim commented on SPARK-29604: -- I've manually run the test suite locally (single run) and

[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set

2019-10-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963476#comment-16963476 ] Jungtaek Lim commented on SPARK-29604: -- [~dongjoon] Do we have any annotation/trait to "isolate"

[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set

2019-10-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963000#comment-16963000 ] Jungtaek Lim commented on SPARK-29604: -- I think it doesn't apply to branch-2.3 as the root issue is

[jira] [Created] (SPARK-29642) ContinuousMemoryStream throws error on String type

2019-10-29 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29642: Summary: ContinuousMemoryStream throws error on String type Key: SPARK-29642 URL: https://issues.apache.org/jira/browse/SPARK-29642 Project: Spark Issue

[jira] [Created] (SPARK-29635) Deduplicate test suites between Kafka micro-batch sink and Kafka continuous sink

2019-10-29 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29635: Summary: Deduplicate test suites between Kafka micro-batch sink and Kafka continuous sink Key: SPARK-29635 URL: https://issues.apache.org/jira/browse/SPARK-29635

[jira] [Commented] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set

2019-10-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959711#comment-16959711 ] Jungtaek Lim commented on SPARK-29604: -- I've figured out the root cause and have a patch. Will

[jira] [Created] (SPARK-29604) SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set

2019-10-25 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29604: Summary: SessionState is initialized with isolated classloader for Hive if spark.sql.hive.metastore.jars is being set Key: SPARK-29604 URL:

[jira] [Commented] (SPARK-28594) Allow event logs for running streaming apps to be rolled over.

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958412#comment-16958412 ] Jungtaek Lim commented on SPARK-28594: -- Please note that SPARK-29579 and SPARK-29581 could be moved

[jira] [Created] (SPARK-29581) Enable cleanup old event log files

2019-10-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29581: Summary: Enable cleanup old event log files Key: SPARK-29581 URL: https://issues.apache.org/jira/browse/SPARK-29581 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-29579) Guarantee compatibility of snapshot (live entities, KVstore entities)

2019-10-23 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29579: Summary: Guarantee compatibility of snapshot (live entities, KVstore entities) Key: SPARK-29579 URL: https://issues.apache.org/jira/browse/SPARK-29579 Project: Spark

[jira] [Updated] (SPARK-29261) Support recover live entities from KVStore for (SQL)AppStatusListener

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29261: - Description: To achieve the incremental replay goal in SHS, we need to support recovering live

[jira] [Updated] (SPARK-29111) Support snapshot/restore of KVStore

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29111: - Description: This issue tracks the effort of supporting snapshot/restore from/to KVStore. Note

[jira] [Updated] (SPARK-28870) Snapshot event log files to support incremental reading

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28870: - Description: This issue tracks the effort on compacting event log files into snapshot and

[jira] [Commented] (SPARK-28870) Snapshot old event log files to support compaction

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958300#comment-16958300 ] Jungtaek Lim commented on SPARK-28870: -- Discussed with Marcelo/Imran offline: I'm changing the goal

[jira] [Updated] (SPARK-28870) Snapshot event log files to support incremental reading

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28870: - Summary: Snapshot event log files to support incremental reading (was: Snapshot old event log

[jira] [Resolved] (SPARK-29538) Test failure: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins

2019-10-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29538. -- Resolution: Duplicate SPARK-29552 dealt with this. Will reopen if it is still flaky. > Test

[jira] [Commented] (SPARK-29538) Test failure: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins

2019-10-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956618#comment-16956618 ] Jungtaek Lim commented on SPARK-29538: -- Another observation: the test fails consistently when the

[jira] [Commented] (SPARK-29538) Test failure: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins

2019-10-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956611#comment-16956611 ] Jungtaek Lim commented on SPARK-29538: -- It doesn't even seem to be flaky. It fails consistently in

[jira] [Updated] (SPARK-29538) Test failure: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins

2019-10-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29538: - Summary: Test failure: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple

[jira] [Created] (SPARK-29538) Flaky test: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins

2019-10-21 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29538: Summary: Flaky test: org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.multiple joins Key: SPARK-29538 URL: https://issues.apache.org/jira/browse/SPARK-29538

[jira] [Commented] (SPARK-29438) Failed to get state store in stream-stream join

2019-10-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955798#comment-16955798 ] Jungtaek Lim commented on SPARK-29438: -- This would be much easier to reproduce: make left

[jira] [Commented] (SPARK-29321) Possible memory leak in Spark

2019-10-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955508#comment-16955508 ] Jungtaek Lim commented on SPARK-29321: -- [~Geopap] Which process was taking slightly more memory?

[jira] [Updated] (SPARK-29321) Possible memory leak in Spark

2019-10-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29321: - Attachment: Screen Shot 2019-10-20 at 10.55.03 PM.png > Possible memory leak in Spark >

[jira] [Commented] (SPARK-29503) MapObjects doesn't copy Unsafe data when nested under Safe data

2019-10-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955090#comment-16955090 ] Jungtaek Lim commented on SPARK-29503: -- Thanks for reporting the issue in super detailed

[jira] [Created] (SPARK-29509) Deduplicate code blocks in Kafka data source

2019-10-18 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29509: Summary: Deduplicate code blocks in Kafka data source Key: SPARK-29509 URL: https://issues.apache.org/jira/browse/SPARK-29509 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29438) Failed to get state store in stream-stream join

2019-10-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953422#comment-16953422 ] Jungtaek Lim commented on SPARK-29438: -- Any updates here? > Failed to get state store in

[jira] [Commented] (SPARK-29461) Spark dataframe writer does not expose metrics for JDBC writer

2019-10-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950808#comment-16950808 ] Jungtaek Lim commented on SPARK-29461: -- I'm taking a look at this. > Spark dataframe writer does

[jira] [Resolved] (SPARK-29361) Enable DataFrame with streaming source support on DSv1

2019-10-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29361. -- Resolution: Invalid Closing this, as there was some explanation that DSv2 is the first one

[jira] [Resolved] (SPARK-29129) Test failure: org.apache.spark.sql.hive.JavaDataFrameSuite (hadoop-2.7/JDK 11 combination)

2019-10-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29129. -- Resolution: Invalid We don't support hadoop-2.7 & jdk 11 profile together. Resolving this as

[jira] [Created] (SPARK-29450) [SS] In streaming aggregation, metric for output rows is not measured in append mode

2019-10-13 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29450: Summary: [SS] In streaming aggregation, metric for output rows is not measured in append mode Key: SPARK-29450 URL: https://issues.apache.org/jira/browse/SPARK-29450

[jira] [Commented] (SPARK-29426) Watermark does not take effect

2019-10-11 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949892#comment-16949892 ] Jungtaek Lim commented on SPARK-29426: -- Could you attach query listener and see the actual event

[jira] [Commented] (SPARK-29438) Failed to get state store in stream-stream join

2019-10-11 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949465#comment-16949465 ] Jungtaek Lim commented on SPARK-29438: -- Could you please link the actual code block from Github

[jira] [Comment Edited] (SPARK-29438) Failed to get state store in stream-stream join

2019-10-11 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949430#comment-16949430 ] Jungtaek Lim edited comment on SPARK-29438 at 10/11/19 1:12 PM: Could

[jira] [Commented] (SPARK-29438) Failed to get state store in stream-stream join

2019-10-11 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949430#comment-16949430 ] Jungtaek Lim commented on SPARK-29438: -- Could you point out the codebase you are referring to?

[jira] [Commented] (SPARK-29139) Flaky test: org.apache.spark.SparkContextSuite.test gpu driver resource files and discovery under local-cluster mode

2019-10-06 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945591#comment-16945591 ] Jungtaek Lim commented on SPARK-29139: -- No problem! I was trying to resolve by myself and realized

[jira] [Commented] (SPARK-29139) Flaky test: org.apache.spark.SparkContextSuite.test gpu driver resource files and discovery under local-cluster mode

2019-10-06 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945482#comment-16945482 ] Jungtaek Lim commented on SPARK-29139: -- [~dongjoon] Somehow this wasn't marked as resolved when PR

[jira] [Updated] (SPARK-29361) Enable streaming source support on DSv1

2019-10-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29361: - Component/s: (was: Spark Core) SQL > Enable streaming source support on

[jira] [Updated] (SPARK-29361) Enable DataFrame with streaming source support on DSv1

2019-10-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29361: - Summary: Enable DataFrame with streaming source support on DSv1 (was: Enable streaming

[jira] [Commented] (SPARK-29361) Enable streaming source support on DSv1

2019-10-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944987#comment-16944987 ] Jungtaek Lim commented on SPARK-29361: -- The plan for now is overloading below methods marked as

[jira] [Created] (SPARK-29361) Enable streaming source support on DSv1

2019-10-04 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29361: Summary: Enable streaming source support on DSv1 Key: SPARK-29361 URL: https://issues.apache.org/jira/browse/SPARK-29361 Project: Spark Issue Type:

[jira] [Issue Comment Deleted] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29322: - Comment: was deleted (was: For event log, we seem to still use "com.github.luben:zstd-jni"

[jira] [Commented] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942508#comment-16942508 ] Jungtaek Lim commented on SPARK-29322: -- For event log, we seem to still use

[jira] [Updated] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29322: - Description: While working on SPARK-28869, I've discovered the issue that reading inprogress

[jira] [Commented] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942493#comment-16942493 ] Jungtaek Lim commented on SPARK-29322: -- {quote} Since this is ZSTD, you are using `hadoop-3.2`

[jira] [Comment Edited] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942456#comment-16942456 ] Jungtaek Lim edited comment on SPARK-29322 at 10/2/19 4:17 AM: --- Just

[jira] [Issue Comment Deleted] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29322: - Comment: was deleted (was: I'll work on PR to propose removing zstd from supported compressions

[jira] [Commented] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942456#comment-16942456 ] Jungtaek Lim commented on SPARK-29322: -- Just initiated discussion on dev. mailing list to see which

[jira] [Commented] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942446#comment-16942446 ] Jungtaek Lim commented on SPARK-29322: -- FYI, the stuck thread finished immediately when I

[jira] [Commented] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942445#comment-16942445 ] Jungtaek Lim commented on SPARK-29322: -- I'll work on PR to propose removing zstd from supported

[jira] [Updated] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29322: - Attachment: history-server-4.jstack history-server-3.jstack

[jira] [Created] (SPARK-29322) History server is stuck reading incomplete event log file compressed with zstd

2019-10-01 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29322: Summary: History server is stuck reading incomplete event log file compressed with zstd Key: SPARK-29322 URL: https://issues.apache.org/jira/browse/SPARK-29322

[jira] [Commented] (SPARK-29097) Spark driver memory exceeded the storage memory

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942332#comment-16942332 ] Jungtaek Lim commented on SPARK-29097: -- Could you check whether SPARK-29055 also resolves this? If then

[jira] [Commented] (SPARK-29321) Possible memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942282#comment-16942282 ] Jungtaek Lim commented on SPARK-29321: -- I'm sorry I meant your last comment not real last comment -

[jira] [Updated] (SPARK-29321) Possible memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29321: - Component/s: (was: Block Manager) > Possible memory leak in Spark >

[jira] [Commented] (SPARK-29321) CLONE - Possible memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942279#comment-16942279 ] Jungtaek Lim commented on SPARK-29321: -- [~Geopap] Appreciate if you could migrate the last comment

[jira] [Updated] (SPARK-29321) Possible memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29321: - Summary: Possible memory leak in Spark (was: CLONE - Possible memory leak in Spark) >

[jira] [Updated] (SPARK-29321) CLONE - Possible memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-29321: - Fix Version/s: (was: 2.4.5) (was: 3.0.0) > CLONE - Possible memory

[jira] [Commented] (SPARK-29055) Memory leak in Spark

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942267#comment-16942267 ] Jungtaek Lim commented on SPARK-29055: -- Looks like I can update the title and description, but then

[jira] [Comment Edited] (SPARK-28094) Multiple left joins or aggregations in one query produce incorrect results

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941710#comment-16941710 ] Jungtaek Lim edited comment on SPARK-28094 at 10/1/19 10:02 AM:

[jira] [Commented] (SPARK-28094) Multiple left joins or aggregations in one query produce incorrect results

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941710#comment-16941710 ] Jungtaek Lim commented on SPARK-28094: -- SPARK-28074 is already merged so it will be included to

[jira] [Created] (SPARK-29314) ProgressReporter.extractStateOperatorMetrics should not overwrite updated as 0 when it actually runs a batch even with no data

2019-10-01 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29314: Summary: ProgressReporter.extractStateOperatorMetrics should not overwrite updated as 0 when it actually runs a batch even with no data Key: SPARK-29314 URL:

[jira] [Commented] (SPARK-29312) Inconsistent behavior on metrics when dropping rows in state between FlatMapGroupWithState and other stateful operators

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941693#comment-16941693 ] Jungtaek Lim commented on SPARK-29312: -- Maybe it is the intention to represent users have removed

[jira] [Created] (SPARK-29312) Inconsistent behavior on metrics when dropping rows in state between FlatMapGroupWithState and other stateful operators

2019-10-01 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29312: Summary: Inconsistent behavior on metrics when dropping rows in state between FlatMapGroupWithState and other stateful operators Key: SPARK-29312 URL:

[jira] [Comment Edited] (SPARK-28094) Multiple left joins or aggregations in one query produce incorrect results

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941623#comment-16941623 ] Jungtaek Lim edited comment on SPARK-28094 at 10/1/19 8:22 AM: --- FYI: After

[jira] [Commented] (SPARK-28094) Multiple left joins or aggregations in one query produce incorrect results

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941623#comment-16941623 ] Jungtaek Lim commented on SPARK-28094: -- FYI: After SPARK-28074, Spark will log a warning message for

[jira] [Updated] (SPARK-28074) [SS] Log warn message on possible correctness issue for multiple stateful operations in single query

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28074: - Issue Type: Improvement (was: Documentation) > [SS] Log warn message on possible correctness

[jira] [Updated] (SPARK-28074) [SS] Log warn message on possible correctness issue for multiple stateful operations in single query

2019-10-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-28074: - Summary: [SS] Log warn message on possible correctness issue for multiple stateful operations

[jira] [Commented] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941492#comment-16941492 ] Jungtaek Lim commented on SPARK-29301: -- [~ploya] This commit is for branch-2.4 : 

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941491#comment-16941491 ] Jungtaek Lim commented on SPARK-27648: -- [~harichandan] This commit is for branch-2.4 : 

[jira] [Commented] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941473#comment-16941473 ] Jungtaek Lim commented on SPARK-29301: -- I'll push the branch-2.4 version of code in my repo and

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941474#comment-16941474 ] Jungtaek Lim commented on SPARK-27648: -- I'll push the branch-2.4 version of code in my repo and

[jira] [Commented] (SPARK-29055) Memory leak in Spark

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941453#comment-16941453 ] Jungtaek Lim commented on SPARK-29055: -- I haven't observed increased memory usage even without the

[jira] [Comment Edited] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940692#comment-16940692 ] Jungtaek Lim edited comment on SPARK-27648 at 9/30/19 7:23 AM: ---

[jira] [Commented] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940692#comment-16940692 ] Jungtaek Lim commented on SPARK-27648: -- [~yy3b2007com] [~harichandan] Sorry I'm late on revisiting

[jira] [Commented] (SPARK-29055) Memory leak in Spark

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940691#comment-16940691 ] Jungtaek Lim commented on SPARK-29055: -- [~Geopap] I guess you're hitting SPARK-29301. Please take

[jira] [Commented] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2019-09-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940672#comment-16940672 ] Jungtaek Lim commented on SPARK-29301: -- Will raise a patch soon. > Removing block is not reflected

[jira] [Created] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2019-09-30 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29301: Summary: Removing block is not reflected to the driver/executor's storage memory Key: SPARK-29301 URL: https://issues.apache.org/jira/browse/SPARK-29301 Project:

[jira] [Commented] (SPARK-29222) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_parameter_convergence

2019-09-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940434#comment-16940434 ] Jungtaek Lim commented on SPARK-29222: -- Did you see test failures consistently and the commit fixed

[jira] [Commented] (SPARK-29055) Memory leak in Spark

2019-09-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940429#comment-16940429 ] Jungtaek Lim commented on SPARK-29055: -- [~Geopap] Could you take memory dump for multiple of times

[jira] [Commented] (SPARK-29055) Memory leak in Spark

2019-09-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940252#comment-16940252 ] Jungtaek Lim commented on SPARK-29055: -- [~Geopap] Could you reproduce against latest Spark 2.4.4?

[jira] [Created] (SPARK-29281) Examples in Like/RLike doesn't consider the default value of spark.sql.parser.escapedStringLiterals

2019-09-28 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-29281: Summary: Examples in Like/RLike doesn't consider the default value of spark.sql.parser.escapedStringLiterals Key: SPARK-29281 URL:

[jira] [Resolved] (SPARK-29221) Flaky test: SQLQueryTestSuite.sql (subquery/scalar-subquery/scalar-subquery-select.sql)

2019-09-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29221. -- Fix Version/s: 3.0.0 Resolution: Fixed This is resolved via 

[jira] [Commented] (SPARK-29248) Pass in number of partitions to BuildWriter

2019-09-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938060#comment-16938060 ] Jungtaek Lim commented on SPARK-29248: -- Assuming writer got the information of number of

[jira] [Commented] (SPARK-29248) Pass in number of partitions to BuildWriter

2019-09-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938041#comment-16938041 ] Jungtaek Lim commented on SPARK-29248: -- SPARK-23889 would be the correct approach to address this -

[jira] [Comment Edited] (SPARK-29239) Subquery should not cause NPE when eliminating subexpression

2019-09-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937491#comment-16937491 ] Jungtaek Lim edited comment on SPARK-29239 at 9/25/19 7:58 AM: --- Would we
