[jira] [Resolved] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2017-06-02 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-19732. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18164

[jira] [Resolved] (SPARK-20974) we should run REPL tests if SQL core has code changes

2017-06-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20974. Resolution: Fixed Fix Version/s: 2.2.0, 2.1.2, 2.0.3

[jira] [Assigned] (SPARK-20974) we should run REPL tests if SQL core has code changes

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20974: Assignee: Apache Spark (was: Wenchen Fan) > we should run REPL tests if SQL core has

[jira] [Assigned] (SPARK-20974) we should run REPL tests if SQL core has code changes

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20974: Assignee: Wenchen Fan (was: Apache Spark) > we should run REPL tests if SQL core has

[jira] [Commented] (SPARK-20974) we should run REPL tests if SQL core has code changes

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035614#comment-16035614 ] Apache Spark commented on SPARK-20974: User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-20974) we should run REPL tests if SQL core has code changes

2017-06-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20974: Summary: we should run REPL tests if SQL core has code changes Key: SPARK-20974 URL: https://issues.apache.org/jira/browse/SPARK-20974 Project: Spark Issue

[jira] [Comment Edited] (SPARK-20973) insert table fail caused by unable to fetch data definition file from remote hdfs

2017-06-02 Thread Yunjian Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035597#comment-16035597 ] Yunjian Zhang edited comment on SPARK-20973 at 6/2/17 11:06 PM: I did

[jira] [Commented] (SPARK-20973) insert table fail caused by unable to fetch data definition file from remote hdfs

2017-06-02 Thread Yunjian Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035597#comment-16035597 ] Yunjian Zhang commented on SPARK-20973: I did check the source code and add a patch to fix the

[jira] [Created] (SPARK-20973) insert table fail caused by unable to fetch data definition file from remote hdfs

2017-06-02 Thread Yunjian Zhang (JIRA)
Yunjian Zhang created SPARK-20973: Summary: insert table fail caused by unable to fetch data definition file from remote hdfs Key: SPARK-20973 URL: https://issues.apache.org/jira/browse/SPARK-20973

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035525#comment-16035525 ] Marcelo Vanzin commented on SPARK-20662: bq. For multiple users in an enterprise deployment, it's

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035519#comment-16035519 ] Xuefu Zhang commented on SPARK-20662: I can understand the counter argument here if Spark is

[jira] [Closed] (SPARK-20737) Mechanism for cleanup hooks, for structured-streaming sinks on executor shutdown.

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-20737. Resolution: Won't Fix > Mechanism for cleanup hooks, for structured-streaming sinks on

[jira] [Commented] (SPARK-17078) show estimated stats when doing explain

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035511#comment-16035511 ] Apache Spark commented on SPARK-17078: User 'wzhfy' has created a pull request for this issue:

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035487#comment-16035487 ] Marcelo Vanzin commented on SPARK-20662: BTW if you really, really, really think this is a good

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035481#comment-16035481 ] Sean Owen commented on SPARK-20662: It's not equivalent to block the job, but why is that more

[jira] [Assigned] (SPARK-20972) rename HintInfo.isBroadcastable to forceBroadcast

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20972: Assignee: Wenchen Fan (was: Apache Spark) > rename HintInfo.isBroadcastable to

[jira] [Assigned] (SPARK-20972) rename HintInfo.isBroadcastable to forceBroadcast

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20972: Assignee: Apache Spark (was: Wenchen Fan) > rename HintInfo.isBroadcastable to

[jira] [Commented] (SPARK-20972) rename HintInfo.isBroadcastable to forceBroadcast

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035479#comment-16035479 ] Apache Spark commented on SPARK-20972: User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035478#comment-16035478 ] Marcelo Vanzin commented on SPARK-20662: bq. It's probably not a good idea to let one job takes

[jira] [Created] (SPARK-20972) rename HintInfo.isBroadcastable to forceBroadcast

2017-06-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20972: Summary: rename HintInfo.isBroadcastable to forceBroadcast Key: SPARK-20972 URL: https://issues.apache.org/jira/browse/SPARK-20972 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035462#comment-16035462 ] Xuefu Zhang commented on SPARK-20662: [~lyc] I'm talking about mapreduce.job.max.map, which is the

[jira] [Updated] (SPARK-20065) Empty output files created for aggregation query in append mode

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20065: Target Version/s: 2.3.0 > Empty output files created for aggregation query in append

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: Target Version/s: 2.3.0 > Watermark metadata is lost when using resolved attributes

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: Component/s: (was: PySpark) > Watermark metadata is lost when using resolved

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: Summary: Watermark metadata is lost when using resolved attributes (was: PySpark Kafka

[jira] [Updated] (SPARK-19903) PySpark Kafka streaming query ouput append mode not possible

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: Description: PySpark example reads a Kafka stream. There is watermarking set when

[jira] [Commented] (SPARK-20002) Add support for unions between streaming and batch datasets

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035441#comment-16035441 ] Michael Armbrust commented on SPARK-20002: I'm not sure that we will ever support this. The

[jira] [Resolved] (SPARK-20147) Cloning SessionState does not clone streaming query listeners

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-20147. Resolution: Fixed Assignee: Kunal Khamar Fix Version/s: 2.2.0

[jira] [Updated] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20928: Description: Given the current Source API, the minimum possible latency for any record

[jira] [Updated] (SPARK-20734) Structured Streaming spark.sql.streaming.schemaInference not handling schema changes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20734: Issue Type: New Feature (was: Bug) > Structured Streaming

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2017-06-02 Thread Simeon H.K. Fitch (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035356#comment-16035356 ] Simeon H.K. Fitch commented on SPARK-7768: [~pgrandjean] Once a UDT is registered, the

[jira] [Commented] (SPARK-20782) Dataset's isCached operator

2017-06-02 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035316#comment-16035316 ] Jacek Laskowski commented on SPARK-20782: Just stumbled upon {{CatalogImpl.isCached}} that could

[jira] [Created] (SPARK-20971) Purge the metadata log for FileStreamSource

2017-06-02 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20971: Summary: Purge the metadata log for FileStreamSource Key: SPARK-20971 URL: https://issues.apache.org/jira/browse/SPARK-20971 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035302#comment-16035302 ] Shixiong Zhu commented on SPARK-20952: What I'm concerned about is global thread pools, such as

[jira] [Created] (SPARK-20970) Deprecate TaskMetrics._updatedBlockStatuses

2017-06-02 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-20970: Summary: Deprecate TaskMetrics._updatedBlockStatuses Key: SPARK-20970 URL: https://issues.apache.org/jira/browse/SPARK-20970 Project: Spark Issue Type:

[jira] [Updated] (SPARK-20914) Javadoc contains code that is invalid

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20914: Priority: Trivial (was: Minor) It's OK if you don't see more like this just now, just open a PR for

[jira] [Updated] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20958: Labels: release-notes (was: ) > Roll back parquet-mr 1.8.2 to parquet-1.8.1

[jira] [Resolved] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-20958. Resolution: Won't Fix Thanks everyone. Sounds like we'll just provide directions in

[jira] [Assigned] (SPARK-19236) Add createOrReplaceGlobalTempView

2017-06-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-19236: Assignee: Arman Yazdani (was: Xiao Li) > Add createOrReplaceGlobalTempView

[jira] [Updated] (SPARK-19236) Add createOrReplaceGlobalTempView

2017-06-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19236: Fix Version/s: 2.2.0 > Add createOrReplaceGlobalTempView

[jira] [Assigned] (SPARK-19236) Add createOrReplaceGlobalTempView

2017-06-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-19236: Assignee: Xiao Li > Add createOrReplaceGlobalTempView

[jira] [Resolved] (SPARK-19236) Add createOrReplaceGlobalTempView

2017-06-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19236. Resolution: Fixed > Add createOrReplaceGlobalTempView

[jira] [Commented] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-02 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035195#comment-16035195 ] Robert Kruszewski commented on SPARK-20952: 2 is already happening on executors where the Task

[jira] [Commented] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-02 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035171#comment-16035171 ] Andrew Ash commented on SPARK-20952: For the localProperties on SparkContext it does 2 things I can

[jira] [Commented] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035149#comment-16035149 ] Cheng Lian commented on SPARK-20958: Thanks [~rdblue]! I'm also reluctant to roll it back considering

[jira] [Resolved] (SPARK-20955) A lot of duplicated "executorId" strings in "TaskUIData"s

2017-06-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-20955. Resolution: Fixed Fix Version/s: 2.2.0 > A lot of duplicated "executorId" strings in

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2017-06-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035062#comment-16035062 ] Nicholas Chammas commented on SPARK-12661: I think we are good to resolve this provided that

[jira] [Updated] (SPARK-15352) Topology aware block replication

2017-06-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-15352: Fix Version/s: 2.2.0 > Topology aware block replication

[jira] [Commented] (SPARK-15352) Topology aware block replication

2017-06-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035061#comment-16035061 ] Dongjoon Hyun commented on SPARK-15352: Thank you, [~shubhamc]! > Topology aware block

[jira] [Resolved] (SPARK-15352) Topology aware block replication

2017-06-02 Thread Shubham Chopra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Chopra resolved SPARK-15352. Resolution: Fixed > Topology aware block replication

[jira] [Resolved] (SPARK-20946) simplify the config setting logic in SparkSession.getOrCreate

2017-06-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20946. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18172

[jira] [Resolved] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20967. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18187

[jira] [Commented] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035012#comment-16035012 ] Michael Armbrust commented on SPARK-19104: I'm about to cut RC3 of 2.2 and there is no pull

[jira] [Commented] (SPARK-20968) Support separator in Tokenizer

2017-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034993#comment-16034993 ] Nick Pentreath commented on SPARK-20968: Would you mind adding more detail here? What is the use

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034964#comment-16034964 ] Marcelo Vanzin commented on SPARK-20662: Yeah, I don't really understand this request. It doesn't

[jira] [Commented] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034961#comment-16034961 ] Dongjoon Hyun commented on SPARK-20958: +1 for [~rdblue]. > Roll back parquet-mr 1.8.2 to

[jira] [Commented] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034950#comment-16034950 ] Ryan Blue commented on SPARK-20958: I don't think it is a good idea to roll back. Spark doesn't depend

[jira] [Updated] (SPARK-19236) Add createOrReplaceGlobalTempView

2017-06-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19236: Fix Version/s: 2.3.0 > Add createOrReplaceGlobalTempView

[jira] [Updated] (SPARK-20969) last() aggregate function fails returning the right answer with ordered windows

2017-06-02 Thread Perrine Letellier (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Perrine Letellier updated SPARK-20969: Description: The column on which `orderBy` is performed is considered as another

[jira] [Updated] (SPARK-20969) last() aggregate function fails returning the right answer with ordered windows

2017-06-02 Thread Perrine Letellier (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Perrine Letellier updated SPARK-20969: Description: The column on which `orderBy` is performed is considered as another

[jira] [Commented] (SPARK-20960) make ColumnVector public

2017-06-02 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034845#comment-16034845 ] Dongjoon Hyun commented on SPARK-20960: cc [~mridulm80] > make ColumnVector public

[jira] [Commented] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-02 Thread CanBin Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034746#comment-16034746 ] CanBin Zheng commented on SPARK-20943: [~saisai_shao] I got you. But I think it's better to change

[jira] [Commented] (SPARK-20790) ALS with implicit feedback ignores negative values

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034730#comment-16034730 ] Apache Spark commented on SPARK-20790: User 'davideis' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20942) The title style about field is error in the history server web ui.

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20942: Assignee: guoxiaolongzte > The title style about field is error in the history server web ui.

[jira] [Resolved] (SPARK-20942) The title style about field is error in the history server web ui.

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20942. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18170

[jira] [Created] (SPARK-20969) last() aggregate function fails returning the right answer with ordered windows

2017-06-02 Thread Perrine Letellier (JIRA)
Perrine Letellier created SPARK-20969: Summary: last() aggregate function fails returning the right answer with ordered windows Key: SPARK-20969 URL: https://issues.apache.org/jira/browse/SPARK-20969

[jira] [Updated] (SPARK-20799) Unable to infer schema for ORC/Parquet on S3N when secrets are in the URL

2017-06-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-20799: Environment: Hadoop 2.8.0 binaries > Unable to infer schema for ORC/Parquet on S3N when

[jira] [Updated] (SPARK-20799) Unable to infer schema for ORC/Parquet on S3N when secrets are in the URL

2017-06-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-20799: Summary: Unable to infer schema for ORC/Parquet on S3N when secrets are in the URL (was:

[jira] [Commented] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-02 Thread caoxuewen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034466#comment-16034466 ] caoxuewen commented on SPARK-20959: thanks for modify Priority > Add a parameter to

[jira] [Created] (SPARK-20968) Support separator in Tokenizer

2017-06-02 Thread darion yaphet (JIRA)
darion yaphet created SPARK-20968: Summary: Support separator in Tokenizer Key: SPARK-20968 URL: https://issues.apache.org/jira/browse/SPARK-20968 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-02 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034427#comment-16034427 ] Robert Kruszewski commented on SPARK-20952: You're right that this needs a bit of

[jira] [Assigned] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20967: Assignee: Apache Spark (was: Wenchen Fan) > SharedState.externalCatalog is not really

[jira] [Assigned] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20967: Assignee: Wenchen Fan (was: Apache Spark) > SharedState.externalCatalog is not really

[jira] [Commented] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034372#comment-16034372 ] Apache Spark commented on SPARK-20967: User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20967: Summary: SharedState.externalCatalog is not really lazy Key: SPARK-20967 URL: https://issues.apache.org/jira/browse/SPARK-20967 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-20967) SharedState.externalCatalog is not really lazy

2017-06-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20967: Issue Type: Improvement (was: Bug) > SharedState.externalCatalog is not really lazy

[jira] [Updated] (SPARK-20966) Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page.

2017-06-02 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guoxiaolongzte updated SPARK-20966: Description: Table data is not sorted by startTime time desc in JDBC/ODBC Server page. Time

[jira] [Assigned] (SPARK-20966) Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page.

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20966: Assignee: (was: Apache Spark) > Table data is not sorted by startTime time desc, time

[jira] [Assigned] (SPARK-20966) Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page.

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20966: Assignee: Apache Spark > Table data is not sorted by startTime time desc, time is not

[jira] [Commented] (SPARK-20966) Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page.

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034345#comment-16034345 ] Apache Spark commented on SPARK-20966: User 'guoxiaolongzte' has created a pull request for this

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034334#comment-16034334 ] Sean Owen commented on SPARK-20662: Isn't this better handled by the resource manager? for example,

[jira] [Created] (SPARK-20966) Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page.

2017-06-02 Thread guoxiaolongzte (JIRA)
guoxiaolongzte created SPARK-20966: Summary: Table data is not sorted by startTime time desc, time is not formatted and redundant code in JDBC/ODBC Server page. Key: SPARK-20966 URL:

[jira] [Commented] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034310#comment-16034310 ] Cheng Lian commented on SPARK-20958: [~rdblue] I think the root cause here is we cherry-picked

[jira] [Commented] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-02 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034296#comment-16034296 ] Saisai Shao commented on SPARK-20943: I think the original purpose of comment is to say

[jira] [Updated] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-20958: Description: We recently realized that parquet-mr 1.8.2 used by Spark 2.2.0-rc2 depends on avro

[jira] [Created] (SPARK-20965) Support PREPARE and EXECUTE statements

2017-06-02 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-20965: Summary: Support PREPARE and EXECUTE statements Key: SPARK-20965 URL: https://issues.apache.org/jira/browse/SPARK-20965 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread lyc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034282#comment-16034282 ] lyc commented on SPARK-20662: Do you mean `mapreduce.job.running.map.limit`? The conf means `The maximum

[jira] [Comment Edited] (SPARK-20662) Block jobs that have greater than a configured number of tasks

2017-06-02 Thread lyc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034282#comment-16034282 ] lyc edited comment on SPARK-20662 at 6/2/17 7:36 AM: Do you mean

[jira] [Created] (SPARK-20964) Make some keywords reserved along with the ANSI/SQL standard

2017-06-02 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-20964: Summary: Make some keywords reserved along with the ANSI/SQL standard Key: SPARK-20964 URL: https://issues.apache.org/jira/browse/SPARK-20964 Project: Spark

[jira] [Commented] (SPARK-20675) Support Index to skip when retrieval disk structure in CoGroupedRDD

2017-06-02 Thread lyc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034256#comment-16034256 ] lyc commented on SPARK-20675: - What do you mean `StreamBuffer`? In commit `6d05c1` (at Jun 1/17), there is

[jira] [Commented] (SPARK-20939) Do not duplicate user-defined functions while optimizing logical query plans

2017-06-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034247#comment-16034247 ] Takeshi Yamamuro commented on SPARK-20939: -- This is not a bug, so I changed the type to

[jira] [Comment Edited] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-02 Thread CanBin Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033988#comment-16033988 ] CanBin Zheng edited comment on SPARK-20943 at 6/2/17 7:00 AM: -- Look at these

[jira] [Updated] (SPARK-20939) Do not duplicate user-defined functions while optimizing logical query plans

2017-06-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-20939: - Issue Type: Improvement (was: Bug) > Do not duplicate user-defined functions while

[jira] [Assigned] (SPARK-20962) Support subquery column aliases in FROM clause

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20962: Assignee: Apache Spark > Support subquery column aliases in FROM clause >

[jira] [Assigned] (SPARK-20962) Support subquery column aliases in FROM clause

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20962: Assignee: (was: Apache Spark) > Support subquery column aliases in FROM clause >

[jira] [Commented] (SPARK-20962) Support subquery column aliases in FROM clause

2017-06-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034228#comment-16034228 ] Apache Spark commented on SPARK-20962: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-02 Thread Nils Grabbert (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034223#comment-16034223 ] Nils Grabbert commented on SPARK-19104: --- [~marmbrus] Why are you moving this major bug to 2.3.0? As

[jira] [Updated] (SPARK-20950) Improve Serializerbuffersize configurable

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20950: -- Priority: Trivial (was: Major) [~heary-cao] please take more care in filling these out. This isn't

[jira] [Updated] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20959: -- Priority: Trivial (was: Major) Sounds closely related to SPARK-20950, and I'm not clear about the use

[jira] [Updated] (SPARK-20962) Support subquery column aliases in FROM clause

2017-06-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-20962: - Description: Currently, we do not support subquery column aliases; {code} scala>
