[jira] [Created] (SPARK-21651) Detect MapType in Json InferSchema

2017-08-07 Thread Jochen Niebuhr (JIRA)
Jochen Niebuhr created SPARK-21651: -- Summary: Detect MapType in Json InferSchema Key: SPARK-21651 URL: https://issues.apache.org/jira/browse/SPARK-21651 Project: Spark Issue Type: Improvemen

[jira] [Updated] (SPARK-21651) Detect MapType in Json InferSchema

2017-08-07 Thread Jochen Niebuhr (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Niebuhr updated SPARK-21651: --- Description: When loading JSON files that include a map with highly variable keys, the curren

[jira] [Created] (SPARK-21652) Optimizer cannot reach a fixed point on certain queries

2017-08-07 Thread Anton Okolnychyi (JIRA)
Anton Okolnychyi created SPARK-21652: Summary: Optimizer cannot reach a fixed point on certain queries Key: SPARK-21652 URL: https://issues.apache.org/jira/browse/SPARK-21652 Project: Spark

[jira] [Commented] (SPARK-21650) Insert into hive partitioned table from spark-sql taking hours to complete

2017-08-07 Thread Madhavi Vaddepalli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116187#comment-16116187 ] Madhavi Vaddepalli commented on SPARK-21650: Thank you Sean Owen. -Madhavi.

[jira] [Commented] (SPARK-21652) Optimizer cannot reach a fixed point on certain queries

2017-08-07 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116192#comment-16116192 ] Anton Okolnychyi commented on SPARK-21652: -- One option to fix this is NOT to app

[jira] [Updated] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21653: Issue Type: Umbrella (was: Improvement) > Complement SQL expression document > ---

[jira] [Created] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21653: --- Summary: Complement SQL expression document Key: SPARK-21653 URL: https://issues.apache.org/jira/browse/SPARK-21653 Project: Spark Issue Type: Improvem

[jira] [Created] (SPARK-21654) Complement predicates expression description

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-21654: --- Summary: Complement predicates expression description Key: SPARK-21654 URL: https://issues.apache.org/jira/browse/SPARK-21654 Project: Spark Issue Type

[jira] [Updated] (SPARK-21638) Warning message of RF is not accurate

2017-08-07 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-21638: -- Description: When training an RF model, there are many warning messages like this: {quote}WARN RandomForest: Tr

[jira] [Updated] (SPARK-21654) Complement predicates expression description

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21654: Issue Type: Sub-task (was: Improvement) Parent: SPARK-21653 > Complement predicate

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116224#comment-16116224 ] Sean Owen commented on SPARK-21653: --- Before continuing, can you please describe what th

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116232#comment-16116232 ] Liang-Chi Hsieh commented on SPARK-21653: - We have {{ExpressionDescription}} for

[jira] [Commented] (SPARK-21652) Optimizer cannot reach a fixed point on certain queries

2017-08-07 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116236#comment-16116236 ] Takeshi Yamamuro commented on SPARK-21652: -- It seems to be a known issue; have you t

[jira] [Updated] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-21653: Description: We have {{ExpressionDescription}} for SQL expressions. The expression descrip

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116241#comment-16116241 ] Liang-Chi Hsieh commented on SPARK-21653: - [~sowen] I made a detailed description

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116275#comment-16116275 ] Hyukjin Kwon commented on SPARK-21653: -- BTW, this sounds like it includes SPARK-18411 and looks

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116282#comment-16116282 ] Liang-Chi Hsieh commented on SPARK-21653: - [~hyukjin.kwon] oh, yeah, looks like i

[jira] [Resolved] (SPARK-21621) Reset numRecordsWritten after DiskBlockObjectWriter.commitAndGet called

2017-08-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21621. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull req

[jira] [Assigned] (SPARK-21621) Reset numRecordsWritten after DiskBlockObjectWriter.commitAndGet called

2017-08-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21621: --- Assignee: Xianyang Liu > Reset numRecordsWritten after DiskBlockObjectWriter.commitAndGet ca

[jira] [Commented] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116324#comment-16116324 ] Hyukjin Kwon commented on SPARK-21653: -- Yes, there was some discussion for adding ar

[jira] [Resolved] (SPARK-13041) Add a driver history ui link and a mesos sandbox link on the dispatcher's ui page for each driver

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13041. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18528 [https://github.co

[jira] [Assigned] (SPARK-13041) Add a driver history ui link and a mesos sandbox link on the dispatcher's ui page for each driver

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-13041: - Assignee: Stavros Kontopoulos > Add a driver history ui link and a mesos sandbox link on the dis

[jira] [Resolved] (SPARK-21623) Comments of parentStats on ml/tree/impl/DTStatsAggregator.scala is wrong

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21623. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18832 [https://github.co

[jira] [Assigned] (SPARK-21623) Comments of parentStats on ml/tree/impl/DTStatsAggregator.scala is wrong

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-21623: - Assignee: Peng Meng > Comments of parentStats on ml/tree/impl/DTStatsAggregator.scala is wrong >

[jira] [Comment Edited] (SPARK-21653) Complement SQL expression document

2017-08-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116324#comment-16116324 ] Hyukjin Kwon edited comment on SPARK-21653 at 8/7/17 11:37 AM:

[jira] [Resolved] (SPARK-21544) Test jar of some module should not install or deploy twice

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21544. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18745 [https://github.co

[jira] [Commented] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-08-07 Thread Pawel Szulc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116507#comment-16116507 ] Pawel Szulc commented on SPARK-19552: - [~srowen] can you elaborate why you think that sha

[jira] [Commented] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116510#comment-16116510 ] Sean Owen commented on SPARK-19552: --- Because it's no longer on the classpath, which is

[jira] [Commented] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-08-07 Thread Pawel Szulc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116513#comment-16116513 ] Pawel Szulc commented on SPARK-19552: - What I see is netty upgrade, not shading http

[jira] [Commented] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116517#comment-16116517 ] Sean Owen commented on SPARK-19552: --- There are two steps here: get the update working (

[jira] [Commented] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

2017-08-07 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116525#comment-16116525 ] Yuming Wang commented on SPARK-21460: - [~Tagar] After [SPARK-19146|https://github.com

[jira] [Created] (SPARK-21655) Kill CLI for Yarn mode

2017-08-07 Thread Jong Yoon Lee (JIRA)
Jong Yoon Lee created SPARK-21655: - Summary: Kill CLI for Yarn mode Key: SPARK-21655 URL: https://issues.apache.org/jira/browse/SPARK-21655 Project: Spark Issue Type: Improvement Co

[jira] [Commented] (SPARK-21655) Kill CLI for Yarn mode

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116616#comment-16116616 ] Sean Owen commented on SPARK-21655: --- Why not just kill the driver via YARN? > Kill CLI

[jira] [Comment Edited] (SPARK-21610) Corrupt records are not handled properly when creating a dataframe from a file

2017-08-07 Thread Jen-Ming Chung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116126#comment-16116126 ] Jen-Ming Chung edited comment on SPARK-21610 at 8/7/17 2:07 PM: ---

[jira] [Resolved] (SPARK-21647) SortMergeJoin failed when using CROSS

2017-08-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21647. - Resolution: Fixed Assignee: Xiao Li Fix Version/s: 2.3.0 2.2.1

[jira] [Commented] (SPARK-21655) Kill CLI for Yarn mode

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116859#comment-16116859 ] Thomas Graves commented on SPARK-21655: --- the yarn kill does work, but it does a kil

[jira] [Commented] (SPARK-21655) Kill CLI for Yarn mode

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116872#comment-16116872 ] Sean Owen commented on SPARK-21655: --- I haven't thought this through but does this open

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116886#comment-16116886 ] Miles Crawford commented on SPARK-18838: This seems to be the core issue for a la

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116892#comment-16116892 ] Marcelo Vanzin commented on SPARK-18838: bq. although that is guaranteed to happe

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116896#comment-16116896 ] Miles Crawford commented on SPARK-18838: We do not use dynamic allocation, and ou

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116903#comment-16116903 ] Marcelo Vanzin commented on SPARK-18838: I'd be interested in seeing logs from an

[jira] [Commented] (SPARK-20863) Add metrics/instrumentation to LiveListenerBus

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116905#comment-16116905 ] Miles Crawford commented on SPARK-20863: How can I enable and view these metrics

[jira] [Comment Edited] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116916#comment-16116916 ] Miles Crawford edited comment on SPARK-18838 at 8/7/17 5:40 PM: ---

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116916#comment-16116916 ] Miles Crawford commented on SPARK-18838: Can I get specific direction on the logs

[jira] [Comment Edited] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Miles Crawford (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116916#comment-16116916 ] Miles Crawford edited comment on SPARK-18838 at 8/7/17 5:41 PM: ---

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116922#comment-16116922 ] Marcelo Vanzin commented on SPARK-18838: I'd like to see something that backs you

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-08-07 Thread Jason Dunkelberger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116930#comment-16116930 ] Jason Dunkelberger commented on SPARK-18838: Thanks for your quick outline [~

[jira] [Commented] (SPARK-21652) Optimizer cannot reach a fixed point on certain queries

2017-08-07 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116936#comment-16116936 ] Anton Okolnychyi commented on SPARK-21652: -- Yes, disabling the constraint propag

[jira] [Created] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Jong Yoon Lee (JIRA)
Jong Yoon Lee created SPARK-21656: - Summary: spark dynamic allocation should not idle timeout executors when tasks still to run Key: SPARK-21656 URL: https://issues.apache.org/jira/browse/SPARK-21656

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116975#comment-16116975 ] Sean Owen commented on SPARK-21656: --- I don't see how an executor would be idle if there

[jira] [Created] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21657: - Summary: Spark has exponential time complexity to explode(array of structs) Key: SPARK-21657 URL: https://issues.apache.org/jira/browse/SPARK-21657 Project:

[jira] [Updated] (SPARK-21374) Reading globbed paths from S3 into DF doesn't work if filesystem caching is disabled

2017-08-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21374: - Fix Version/s: 2.2.1 > Reading globbed paths from S3 into DF doesn't work if filesystem caching i

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Attachment: ExponentialTimeGrowth.PNG nested-data-generator-and-test.py

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Description: It can take up to half a day to explode a modest-sized nested collection (

[jira] [Commented] (SPARK-21655) Kill CLI for Yarn mode

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116989#comment-16116989 ] Thomas Graves commented on SPARK-21655: --- The UI kill requests are acl protected. Y

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Description: It can take up to half a day to explode a modest-sized nested collection (

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21657: -- Priority: Major (was: Critical) Issue Type: Improvement (was: Bug) (Not a bug) I doubt this is

[jira] [Commented] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117051#comment-16117051 ] Ruslan Dautkhanov commented on SPARK-21657: --- Absolutely, this is a real use cas

[jira] [Updated] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-21656: -- Priority: Major (was: Minor) > spark dynamic allocation should not idle timeout executors when

[jira] [Updated] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-21656: -- Issue Type: Bug (was: Improvement) > spark dynamic allocation should not idle timeout executor

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Labels: cache caching collections nested_types performance pyspark sparksql sql (was:

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117086#comment-16117086 ] Thomas Graves commented on SPARK-21656: --- The executor can be idle if the scheduler

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-08-07 Thread Louis Bergelson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117090#comment-16117090 ] Louis Bergelson commented on SPARK-650: --- [~srowen] Thanks for the reply and the exam

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117114#comment-16117114 ] Sean Owen commented on SPARK-650: - I can also imagine cases involving legacy code that make

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117122#comment-16117122 ] Sean Owen commented on SPARK-21656: --- Good point. In that case, what's wrong with killin

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117159#comment-16117159 ] Thomas Graves commented on SPARK-21656: --- If given more time the scheduler would hav

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117167#comment-16117167 ] Sean Owen commented on SPARK-21656: --- If the issue is "given more time" then increase th

[jira] [Resolved] (SPARK-21565) aggregate query fails with watermark on eventTime but works with watermark on timestamp column generated by current_timestamp

2017-08-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21565. -- Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > aggregate query fails

[jira] [Assigned] (SPARK-21565) aggregate query fails with watermark on eventTime but works with watermark on timestamp column generated by current_timestamp

2017-08-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-21565: Assignee: Jose Torres > aggregate query fails with watermark on eventTime but works with w

[jira] [Resolved] (SPARK-21648) Confusing assert failure in JDBC source when users misspell the option `partitionColumn`

2017-08-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21648. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > Confusing assert failure in JDBC

[jira] [Commented] (SPARK-21565) aggregate query fails with watermark on eventTime but works with watermark on timestamp column generated by current_timestamp

2017-08-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117173#comment-16117173 ] Shixiong Zhu commented on SPARK-21565: -- Resolved by https://github.com/apache/spark/

[jira] [Comment Edited] (SPARK-21652) Optimizer cannot reach a fixed point on certain queries

2017-08-07 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116936#comment-16116936 ] Anton Okolnychyi edited comment on SPARK-21652 at 8/7/17 8:06 PM: -

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117200#comment-16117200 ] Thomas Graves commented on SPARK-21656: --- why not fix the bug in dynamic allocation?

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117204#comment-16117204 ] Thomas Graves commented on SPARK-21656: --- Another option would be just to add logic

[jira] [Created] (SPARK-21658) Adds the default None for value in na.replace in PySpark to match

2017-08-07 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-21658: Summary: Adds the default None for value in na.replace in PySpark to match Key: SPARK-21658 URL: https://issues.apache.org/jira/browse/SPARK-21658 Project: Spark

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117217#comment-16117217 ] Sean Owen commented on SPARK-21656: --- I do not understand what the bug is. Configuration

[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

2017-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117241#comment-16117241 ] Thomas Graves commented on SPARK-21656: --- As I said above, it DOES help the applicati

[jira] [Updated] (SPARK-18535) Redact sensitive information from Spark logs and UI

2017-08-07 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-18535: --- Fix Version/s: 2.1.2 > Redact sensitive information from Spark logs and UI >

[jira] [Closed] (SPARK-21362) Add JDBCDialect for Apache Drill

2017-08-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-21362. --- Resolution: Won't Fix See my comment on github ... > Add JDBCDialect for Apache Drill > ---

[jira] [Created] (SPARK-21659) FileStreamSink checks for _spark_metadata even if path has globs

2017-08-07 Thread peay (JIRA)
peay created SPARK-21659: Summary: FileStreamSink checks for _spark_metadata even if path has globs Key: SPARK-21659 URL: https://issues.apache.org/jira/browse/SPARK-21659 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-21542) Helper functions for custom Python Persistence

2017-08-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-21542: - Assignee: Ajay Saini > Helper functions for custom Python Persistence >

[jira] [Resolved] (SPARK-21542) Helper functions for custom Python Persistence

2017-08-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-21542. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18742 [h

[jira] [Commented] (SPARK-21658) Adds the default None for value in na.replace in PySpark to match

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117718#comment-16117718 ] Liang-Chi Hsieh commented on SPARK-21658: - I will mentor a beginner to work on th

[jira] [Commented] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117780#comment-16117780 ] Liang-Chi Hsieh commented on SPARK-21631: - [~ibingoogle] I saw you did {{export N

[jira] [Comment Edited] (SPARK-21631) Building Spark with SBT unsuccessful when source code in Mllib is modified, But with MVN is ok

2017-08-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117780#comment-16117780 ] Liang-Chi Hsieh edited comment on SPARK-21631 at 8/8/17 3:10 AM: --

[jira] [Updated] (SPARK-21306) OneVsRest Conceals Columns That May Be Relevant To Underlying Classifier

2017-08-07 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-21306: Fix Version/s: 2.1.2 2.0.3 > OneVsRest Conceals Columns That May Be Relevant To

[jira] [Created] (SPARK-21660) Yarn ShuffleService failed to start when the chosen directory become read-only

2017-08-07 Thread lishuming (JIRA)
lishuming created SPARK-21660: - Summary: Yarn ShuffleService failed to start when the chosen directory become read-only Key: SPARK-21660 URL: https://issues.apache.org/jira/browse/SPARK-21660 Project: Spa

[jira] [Resolved] (SPARK-20894) Error while checkpointing to HDFS

2017-08-07 Thread Mark Grover (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover resolved SPARK-20894. - Resolution: Fixed Fix Version/s: 2.3.0 > Error while checkpointing to HDFS > -

[jira] [Created] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2017-08-07 Thread Dapeng Sun (JIRA)
Dapeng Sun created SPARK-21661: -- Summary: SparkSQL can't merge load table from Hadoop Key: SPARK-21661 URL: https://issues.apache.org/jira/browse/SPARK-21661 Project: Spark Issue Type: Improveme

[jira] [Updated] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2017-08-07 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated SPARK-21661: --- Description: Here is the original text of external table on HDFS: {noformat} Permission Owner

[jira] [Commented] (SPARK-21590) Structured Streaming window start time should support negative values to adjust time zone

2017-08-07 Thread Kevin Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117900#comment-16117900 ] Kevin Zhang commented on SPARK-21590: - [~brkyvz] Thanks for your advice, but I believ

[jira] [Comment Edited] (SPARK-21590) Structured Streaming window start time should support negative values to adjust time zone

2017-08-07 Thread Kevin Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117900#comment-16117900 ] Kevin Zhang edited comment on SPARK-21590 at 8/8/17 6:05 AM: -

[jira] [Comment Edited] (SPARK-21590) Structured Streaming window start time should support negative values to adjust time zone

2017-08-07 Thread Kevin Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117900#comment-16117900 ] Kevin Zhang edited comment on SPARK-21590 at 8/8/17 6:15 AM: -

[jira] [Created] (SPARK-21662) modify the appname to [SparkSQL::localHostName] instead of [SparkSQL::lP]

2017-08-07 Thread liuzhaokun (JIRA)
liuzhaokun created SPARK-21662: -- Summary: modify the appname to [SparkSQL::localHostName] instead of [SparkSQL::lP] Key: SPARK-21662 URL: https://issues.apache.org/jira/browse/SPARK-21662 Project: Spark