[jira] [Issue Comment Deleted] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2020-06-08 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-17604: - Comment: was deleted (was: The issue is even reported from user group, refer here: https://lis

[jira] [Commented] (SPARK-28594) Allow event logs for running streaming apps to be rolled over

2020-06-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127788#comment-17127788 ] Jungtaek Lim commented on SPARK-28594: -- Unfortunately that is most probably the gua

[jira] [Commented] (SPARK-28594) Allow event logs for running streaming apps to be rolled over

2020-06-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127776#comment-17127776 ] Jungtaek Lim commented on SPARK-28594: -- Actually it has been an issue with almost a

[jira] [Commented] (SPARK-31812) Spark to support the auto cancelation of delegation token when an Application completes

2020-05-31 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120749#comment-17120749 ] Jungtaek Lim commented on SPARK-31812: -- [~kamrul] In general Spark project doesn't

[jira] [Commented] (SPARK-31764) JsonProtocol doesn't write RDDInfo#isBarrier

2020-05-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118244#comment-17118244 ] Jungtaek Lim commented on SPARK-31764: -- Thanks for confirming. :) > JsonProtocol d

[jira] [Commented] (SPARK-31764) JsonProtocol doesn't write RDDInfo#isBarrier

2020-05-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118201#comment-17118201 ] Jungtaek Lim commented on SPARK-31764: -- For me this looks to be a bug - the descrip

[jira] [Commented] (SPARK-31841) Dataset.repartition leverage adaptive execution

2020-05-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118114#comment-17118114 ] Jungtaek Lim commented on SPARK-31841: -- It sounds like a question/feature request w

[jira] [Commented] (SPARK-26646) Flaky test: pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction

2020-05-26 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117400#comment-17117400 ] Jungtaek Lim commented on SPARK-26646: -- Still happening. https://amplab.cs.berkele

[jira] [Commented] (SPARK-29137) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_train_prediction

2020-05-26 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117399#comment-17117399 ] Jungtaek Lim commented on SPARK-29137: -- Still valid on latest master. https://ampl

[jira] [Created] (SPARK-31831) Flaky test: org.apache.spark.sql.hive.thriftserver.HiveSessionImplSuite.(It is not a test it is a sbt.testing.SuiteSelector)

2020-05-26 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31831: Summary: Flaky test: org.apache.spark.sql.hive.thriftserver.HiveSessionImplSuite.(It is not a test it is a sbt.testing.SuiteSelector) Key: SPARK-31831 URL: https://issues.apache.

[jira] [Commented] (SPARK-23539) Add support for Kafka headers in Structured Streaming

2020-05-26 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116653#comment-17116653 ] Jungtaek Lim commented on SPARK-23539: -- You can ignore the affect version in most c

[jira] [Commented] (SPARK-31794) Incorrect distribution with repartitionByRange and repartition column expression

2020-05-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116470#comment-17116470 ] Jungtaek Lim commented on SPARK-31794: -- http://spark.apache.org/docs/3.0.0-preview2

[jira] [Updated] (SPARK-31793) Reduce the memory usage in file scan location metadata

2020-05-24 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31793: - Fix Version/s: 3.1.0 > Reduce the memory usage in file scan location metadata >

[jira] [Commented] (SPARK-31794) Incorrect distribution with repartitionByRange and repartition column expression

2020-05-24 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115080#comment-17115080 ] Jungtaek Lim commented on SPARK-31794: -- Please read through the doc of these method

[jira] [Commented] (SPARK-31792) Introduce the structured streaming UI in the Web UI page

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113791#comment-17113791 ] Jungtaek Lim commented on SPARK-31792: -- Fix version is for tracking the version whi

[jira] [Updated] (SPARK-31792) Introduce the structured streaming UI in the Web UI page

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31792: - Fix Version/s: (was: 3.0.0) > Introduce the structured streaming UI in the Web UI page > ---

[jira] [Commented] (SPARK-31789) SparkSubmitOperator could not get Exit Code after log stream interrupted by k8s old resource version execption

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113708#comment-17113708 ] Jungtaek Lim commented on SPARK-31789: -- critical / blocker are tend to be reserved

[jira] [Updated] (SPARK-31789) SparkSubmitOperator could not get Exit Code after log stream interrupted by k8s old resource version execption

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31789: - Priority: Major (was: Blocker) > SparkSubmitOperator could not get Exit Code after log stream i

[jira] [Comment Edited] (SPARK-31761) Sql Div operator can result in incorrect output for int_min

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113706#comment-17113706 ] Jungtaek Lim edited comment on SPARK-31761 at 5/22/20, 2:58 AM: --

[jira] [Commented] (SPARK-31761) Sql Div operator can result in incorrect output for int_min

2020-05-21 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113706#comment-17113706 ] Jungtaek Lim commented on SPARK-31761: -- Let's make sure priority is marked properly

[jira] [Commented] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-20 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112816#comment-17112816 ] Jungtaek Lim commented on SPARK-31754: -- I can also take a look if the input and che

[jira] [Comment Edited] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111696#comment-17111696 ] Jungtaek Lim edited comment on SPARK-31754 at 5/20/20, 2:50 AM: --

[jira] [Comment Edited] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111696#comment-17111696 ] Jungtaek Lim edited comment on SPARK-31754 at 5/20/20, 2:21 AM: --

[jira] [Commented] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111696#comment-17111696 ] Jungtaek Lim commented on SPARK-31754: -- Looks like the row itself is null which sho

[jira] [Commented] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110873#comment-17110873 ] Jungtaek Lim commented on SPARK-31754: -- [~puviarasu] Given the error comes from "ge

[jira] [Updated] (SPARK-31754) Spark Structured Streaming: NullPointerException in Stream Stream join

2020-05-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31754: - Priority: Major (was: Blocker) > Spark Structured Streaming: NullPointerException in Stream Str

[jira] [Updated] (SPARK-31257) Unify create table syntax to fix ambiguous two different CREATE TABLE syntaxes

2020-05-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31257: - Summary: Unify create table syntax to fix ambiguous two different CREATE TABLE syntaxes (was: F

[jira] [Updated] (SPARK-31257) Fix ambiguous two different CREATE TABLE syntaxes

2020-05-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31257: - Affects Version/s: (was: 3.0.0) 3.1.0 Description: There's

[jira] [Updated] (SPARK-31722) flaky streaming tests

2020-05-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31722: - Component/s: (was: Structured Streaming) DStreams > flaky streaming tests >

[jira] [Updated] (SPARK-31707) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-05-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31707: - Description: According to the latest status of discussion in the dev@ mailing list, [[DISCUSS]

[jira] [Created] (SPARK-31707) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-05-13 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31707: Summary: Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax Key: SPARK-31707 URL: https://issues.apache.org/jira/browse/SPARK-31707 Project

[jira] [Commented] (SPARK-29046) Possible NPE on SQLConf.get when SparkContext is stopping in another thread

2020-05-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106104#comment-17106104 ] Jungtaek Lim commented on SPARK-29046: -- Sorry I don't know. Also worth noting that

[jira] [Resolved] (SPARK-31698) NPE on big dataset plans

2020-05-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-31698. -- Resolution: Duplicate The weird error message and stack trace is matched with SPARK-29046 whic

[jira] [Commented] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-05-03 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098723#comment-17098723 ] Jungtaek Lim commented on SPARK-26385: -- [~rajeevkumar] Yes please raise a separate

[jira] [Resolved] (SPARK-31599) Reading from S3 (Structured Streaming Bucket) Fails after Compaction

2020-04-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-31599. -- Resolution: Invalid > Reading from S3 (Structured Streaming Bucket) Fails after Compaction > -

[jira] [Commented] (SPARK-31599) Reading from S3 (Structured Streaming Bucket) Fails after Compaction

2020-04-30 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096483#comment-17096483 ] Jungtaek Lim commented on SPARK-31599: -- You understand how file stream sink and fil

[jira] [Resolved] (SPARK-30261) Should not change owner of hive table for some commands like 'alter' operation

2020-04-28 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-30261. -- Target Version/s: 2.4.3, 2.3.0 (was: 2.3.0, 2.4.3) Resolution: Duplicate > Should n

[jira] [Commented] (SPARK-31599) Reading from S3 (Structured Streaming Bucket) Fails after Compaction

2020-04-28 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094925#comment-17094925 ] Jungtaek Lim commented on SPARK-31599: -- Oh sorry I should guide to user@ mailing li

[jira] [Commented] (SPARK-31599) Reading from S3 (Structured Streaming Bucket) Fails after Compaction

2020-04-28 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094915#comment-17094915 ] Jungtaek Lim commented on SPARK-31599: -- Please post a mail thread on dev@ mailing l

[jira] [Updated] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2020-04-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-17604: - Affects Version/s: 3.1.0 Labels: (was: bulk-closed) Priority: Major

[jira] [Reopened] (SPARK-17604) Support purging aged file entry for FileStreamSource metadata log

2020-04-27 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reopened SPARK-17604: -- Reopening this, as end user reported this in user mailing list recently. https://lists.apache.org

[jira] [Commented] (SPARK-31559) AM starts with initial fetched tokens in any attempt

2020-04-26 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092928#comment-17092928 ] Jungtaek Lim commented on SPARK-31559: -- PR submitted: https://github.com/apache/spa

[jira] [Commented] (SPARK-31554) Flaky test suite org.apache.spark.sql.hive.thriftserver.CliSuite

2020-04-24 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092012#comment-17092012 ] Jungtaek Lim commented on SPARK-31554: -- There're two existing PRs addressing the te

[jira] [Created] (SPARK-31559) AM starts with initial fetched tokens in any attempt

2020-04-24 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31559: Summary: AM starts with initial fetched tokens in any attempt Key: SPARK-31559 URL: https://issues.apache.org/jira/browse/SPARK-31559 Project: Spark Issue Ty

[jira] [Commented] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-04-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091075#comment-17091075 ] Jungtaek Lim commented on SPARK-26385: -- The symptoms are mixed up - please clarify

[jira] [Resolved] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires

2020-04-23 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-27891. -- Resolution: Cannot Reproduce SPARK-23361 is in Spark 2.4.0 and the fix is not going to be 2.3.

[jira] [Commented] (SPARK-31460) spark-sql-kafka source in spark 2.4.4 causes reading stream failure frequently

2020-04-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085346#comment-17085346 ] Jungtaek Lim commented on SPARK-31460: -- 1. Please check your app / submit phase doe

[jira] [Commented] (SPARK-26646) Flaky test: pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction

2020-04-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083856#comment-17083856 ] Jungtaek Lim commented on SPARK-26646: -- Looks like still happening on master branch

[jira] [Commented] (SPARK-29222) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_parameter_convergence

2020-04-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083857#comment-17083857 ] Jungtaek Lim commented on SPARK-29222: -- Still happening on master (3.1.0-SNAPSHOT)

[jira] [Comment Edited] (SPARK-29137) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_train_prediction

2020-04-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083855#comment-17083855 ] Jungtaek Lim edited comment on SPARK-29137 at 4/15/20, 6:58 AM: --

[jira] [Commented] (SPARK-29137) Flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests.test_train_prediction

2020-04-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083855#comment-17083855 ] Jungtaek Lim commented on SPARK-29137: -- Still valid on latest master (3.1.0-SNAPSHO

[jira] [Commented] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-04-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083849#comment-17083849 ] Jungtaek Lim commented on SPARK-26385: -- Probably you may need to share the entire l

[jira] [Commented] (SPARK-31427) Spark Structure streaming read data twice per every micro-batch.

2020-04-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082036#comment-17082036 ] Jungtaek Lim commented on SPARK-31427: -- Could you please check whether using Spark

[jira] [Comment Edited] (SPARK-31376) Non-global sort support for structured streaming

2020-04-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077821#comment-17077821 ] Jungtaek Lim edited comment on SPARK-31376 at 4/8/20, 4:46 AM: ---

[jira] [Commented] (SPARK-31376) Non-global sort support for structured streaming

2020-04-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077821#comment-17077821 ] Jungtaek Lim commented on SPARK-31376: -- Btw it would be even better if you initiate

[jira] [Commented] (SPARK-31376) Non-global sort support for structured streaming

2020-04-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077819#comment-17077819 ] Jungtaek Lim commented on SPARK-31376: -- I'm saying that sort is simply unavailable

[jira] [Commented] (SPARK-31376) Non-global sort support for structured streaming

2020-04-07 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077723#comment-17077723 ] Jungtaek Lim commented on SPARK-31376: -- I'll reflect the question; why do you think

[jira] [Resolved] (SPARK-30436) CREATE EXTERNAL TABLE doesn't work without STORED AS

2020-04-06 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-30436. -- Resolution: Duplicate > CREATE EXTERNAL TABLE doesn't work without STORED AS > ---

[jira] [Commented] (SPARK-31312) Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation

2020-04-01 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073264#comment-17073264 ] Jungtaek Lim commented on SPARK-31312: -- No, it wasn't triggered by SPARK-26560 and

[jira] [Created] (SPARK-31312) Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation

2020-03-31 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31312: Summary: Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation Key: SPARK-31312 URL: https://issues.apache.org/jira/browse/SPARK-31312

[jira] [Created] (SPARK-31257) Fix ambiguous two different CREATE TABLE syntaxes

2020-03-25 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31257: Summary: Fix ambiguous two different CREATE TABLE syntaxes Key: SPARK-31257 URL: https://issues.apache.org/jira/browse/SPARK-31257 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-18 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058333#comment-17058333 ] Jungtaek Lim edited comment on SPARK-31136 at 3/18/20, 8:14 AM: --

[jira] [Commented] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2020-03-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060004#comment-17060004 ] Jungtaek Lim commented on SPARK-29301: -- Thanks for reminding. Marked as duplicated.

[jira] [Resolved] (SPARK-27648) In Spark2.4 Structured Streaming:The executor storage memory increasing over time

2020-03-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-27648. -- Resolution: Duplicate > In Spark2.4 Structured Streaming:The executor storage memory increasin

[jira] [Resolved] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2020-03-16 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-29301. -- Resolution: Duplicate > Removing block is not reflected to the driver/executor's storage memor

[jira] [Commented] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-15 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059914#comment-17059914 ] Jungtaek Lim commented on SPARK-31136: -- For 2, I just initiated the discussion thre

[jira] [Commented] (SPARK-31143) Spark 2.4.4 count distinct query much slower than Spark 1.6.2 and Hive 1.2.1

2020-03-13 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058573#comment-17058573 ] Jungtaek Lim commented on SPARK-31143: -- [~shijiezhiai] Could you please leave the i

[jira] [Comment Edited] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058380#comment-17058380 ] Jungtaek Lim edited comment on SPARK-31136 at 3/13/20, 3:01 AM: --

[jira] [Commented] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058380#comment-17058380 ] Jungtaek Lim commented on SPARK-31136: -- https://github.com/apache/spark/blob/master

[jira] [Comment Edited] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058333#comment-17058333 ] Jungtaek Lim edited comment on SPARK-31136 at 3/13/20, 1:34 AM: --

[jira] [Commented] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058333#comment-17058333 ] Jungtaek Lim commented on SPARK-31136: -- This reminds me about my previous PR: [htt

[jira] [Commented] (SPARK-25987) StackOverflowError when executing many operations on a table with many columns

2020-03-12 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057713#comment-17057713 ] Jungtaek Lim commented on SPARK-25987: -- The root cause is the way of how "flowAnaly

[jira] [Created] (SPARK-31115) Lots of columns and distinct aggregation functions triggers compile exception on Janino

2020-03-11 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31115: Summary: Lots of columns and distinct aggregation functions triggers compile exception on Janino Key: SPARK-31115 URL: https://issues.apache.org/jira/browse/SPARK-31115

[jira] [Commented] (SPARK-31099) Create migration script for metastore_db

2020-03-10 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056465#comment-17056465 ] Jungtaek Lim commented on SPARK-31099: -- [~dongjoon] Could you elaborate your comme

[jira] [Created] (SPARK-31101) Upgrade Janino to 3.1.1

2020-03-09 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31101: Summary: Upgrade Janino to 3.1.1 Key: SPARK-31101 URL: https://issues.apache.org/jira/browse/SPARK-31101 Project: Spark Issue Type: Dependency upgrade

[jira] [Updated] (SPARK-31011) Failed to register signal handler for PWR

2020-03-06 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-31011: - Affects Version/s: (was: 3.0.0) 3.1.0 > Failed to register signal han

[jira] [Updated] (SPARK-30993) GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has fixed length

2020-03-03 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-30993: - Fix Version/s: 2.4.6 > GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its

[jira] [Commented] (SPARK-31011) Failed to register signal handler for PWR

2020-03-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049880#comment-17049880 ] Jungtaek Lim commented on SPARK-31011: -- According to the wikipedia, SIGPWR is NOT l

[jira] [Comment Edited] (SPARK-31011) Failed to register signal handler for PWR

2020-03-02 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049880#comment-17049880 ] Jungtaek Lim edited comment on SPARK-31011 at 3/3/20 4:11 AM:

[jira] [Created] (SPARK-31014) InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap

2020-03-02 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-31014: Summary: InMemoryStore: CountingRemoveIfForEach misses to remove key from parentToChildrenMap Key: SPARK-31014 URL: https://issues.apache.org/jira/browse/SPARK-31014

[jira] [Commented] (SPARK-30993) GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048456#comment-17048456 ] Jungtaek Lim commented on SPARK-30993: -- Just confirmed the problem persists in bran

[jira] [Updated] (SPARK-30993) GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-30993: - Affects Version/s: 2.3.4 2.4.5 > GenerateUnsafeRowJoiner corrupts the val

[jira] [Commented] (SPARK-30993) GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048321#comment-17048321 ] Jungtaek Lim commented on SPARK-30993: -- During review phase I'll check which versio

[jira] [Updated] (SPARK-30993) GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-30993: - Summary: GenerateUnsafeRowJoiner corrupts the value if the datatype is UDF and its sql type has

[jira] [Commented] (SPARK-30993) GenerateUnsafeRowJoiner incorrectly modifies the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048318#comment-17048318 ] Jungtaek Lim commented on SPARK-30993: -- The reporter uses Spark 2.3.0, and validate

[jira] [Commented] (SPARK-30993) GenerateUnsafeRowJoiner incorrectly modifies the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048317#comment-17048317 ] Jungtaek Lim commented on SPARK-30993: -- Will submit a PR soon. Btw, it looks to be

[jira] [Created] (SPARK-30993) GenerateUnsafeRowJoiner incorrectly modifies the value if the datatype is UDF and its sql type has fixed length

2020-02-29 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30993: Summary: GenerateUnsafeRowJoiner incorrectly modifies the value if the datatype is UDF and its sql type has fixed length Key: SPARK-30993 URL: https://issues.apache.org/jira/brows

[jira] [Commented] (SPARK-24295) Purge Structured streaming FileStreamSinkLog metadata compact file data.

2020-02-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045011#comment-17045011 ] Jungtaek Lim commented on SPARK-24295: -- [~iqbal_khattra] [~alfredo-gimenez-bv] Hi,

[jira] [Commented] (SPARK-29995) Structured Streaming file-sink log grow indefinitely

2020-02-25 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045009#comment-17045009 ] Jungtaek Lim commented on SPARK-29995: -- [~zhangliming] Hi, if you're open to try o

[jira] [Created] (SPARK-30946) FileStreamSourceLog/FileStreamSinkLog: leverage UnsafeRow type to serialize/deserialize entry

2020-02-25 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30946: Summary: FileStreamSourceLog/FileStreamSinkLog: leverage UnsafeRow type to serialize/deserialize entry Key: SPARK-30946 URL: https://issues.apache.org/jira/browse/SPARK-30946

[jira] [Created] (SPARK-30943) Show "batch ID" in tool tip string for Structured Streaming UI graphs

2020-02-24 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30943: Summary: Show "batch ID" in tool tip string for Structured Streaming UI graphs Key: SPARK-30943 URL: https://issues.apache.org/jira/browse/SPARK-30943 Project: Spark

[jira] [Created] (SPARK-30915) FileStreamSinkLog: Avoid reading the metadata log file when finding the latest batch ID

2020-02-21 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30915: Summary: FileStreamSinkLog: Avoid reading the metadata log file when finding the latest batch ID Key: SPARK-30915 URL: https://issues.apache.org/jira/browse/SPARK-30915

[jira] [Created] (SPARK-30900) FileStreamSource: Avoid reading compact metadata log twice if the query stops from compact batch and restarts

2020-02-20 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30900: Summary: FileStreamSource: Avoid reading compact metadata log twice if the query stops from compact batch and restarts Key: SPARK-30900 URL: https://issues.apache.org/jira/browse/

[jira] [Updated] (SPARK-30860) Different behavior between rolling and non-rolling event log

2020-02-19 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-30860: - Priority: Major (was: Minor) > Different behavior between rolling and non-rolling event log > -

[jira] [Created] (SPARK-30866) FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files

2020-02-18 Thread Jungtaek Lim (Jira)
Jungtaek Lim created SPARK-30866: Summary: FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files Key: SPARK-30866 URL: https://issues.apache.org/jira/browse/SPARK-30866

[jira] [Updated] (SPARK-30860) Different behavior between rolling and non-rolling event log

2020-02-17 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-30860: - Component/s: (was: Deploy) Spark Core > Different behavior between rolling

[jira] [Comment Edited] (SPARK-30860) Different behavior between rolling and non-rolling event log

2020-02-17 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038662#comment-17038662 ] Jungtaek Lim edited comment on SPARK-30860 at 2/17/20 10:21 PM:

[jira] [Commented] (SPARK-30860) Different behavior between rolling and non-rolling event log

2020-02-17 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038662#comment-17038662 ] Jungtaek Lim commented on SPARK-30860: -- [~Kimahriman] Thanks for reporting! As we

[jira] [Comment Edited] (SPARK-30586) NPE in LiveRDDDistribution (AppStatusListener)

2020-02-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036788#comment-17036788 ] Jungtaek Lim edited comment on SPARK-30586 at 2/14/20 8:38 AM:

[jira] [Commented] (SPARK-30586) NPE in LiveRDDDistribution (AppStatusListener)

2020-02-14 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036788#comment-17036788 ] Jungtaek Lim commented on SPARK-30586: -- "onExecutorAdded" doesn't fill up hostPort
