[jira] [Commented] (SPARK-20365) Not so accurate classpath format for AM and Containers

2017-06-01 Thread lyc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034176#comment-16034176 ] lyc commented on SPARK-20365: - Thanks for reviewing. > Not so accurate classpath format for AM and

[jira] [Assigned] (SPARK-20961) generalize the dictionary in ColumnVector

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20961: Assignee: Apache Spark (was: Wenchen Fan) > generalize the dictionary in ColumnVector >

[jira] [Assigned] (SPARK-20961) generalize the dictionary in ColumnVector

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20961: Assignee: Wenchen Fan (was: Apache Spark) > generalize the dictionary in ColumnVector >

[jira] [Commented] (SPARK-20961) generalize the dictionary in ColumnVector

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034122#comment-16034122 ] Apache Spark commented on SPARK-20961: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20960) make ColumnVector public

2017-06-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20960: --- Assignee: (was: Wenchen Fan) > make ColumnVector public > > >

[jira] [Created] (SPARK-20961) generalize the dictionary in ColumnVector

2017-06-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20961: --- Summary: generalize the dictionary in ColumnVector Key: SPARK-20961 URL: https://issues.apache.org/jira/browse/SPARK-20961 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-20960) make ColumnVector public

2017-06-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-20960: --- Summary: make ColumnVector public Key: SPARK-20960 URL: https://issues.apache.org/jira/browse/SPARK-20960 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-20960) make ColumnVector public

2017-06-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034118#comment-16034118 ] Wenchen Fan commented on SPARK-20960: - cc [~wesmckinn] > make ColumnVector public >

[jira] [Commented] (SPARK-20854) extend hint syntax to support any expression, not just identifiers or strings

2017-06-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034115#comment-16034115 ] Felix Cheung commented on SPARK-20854: -- seems like it would be good to add support for the same in

[jira] [Comment Edited] (SPARK-20149) Audit PySpark code base for 2.6 specific work arounds

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034090#comment-16034090 ] Hyukjin Kwon edited comment on SPARK-20149 at 6/2/17 4:32 AM: -- [~holdenk], I

[jira] [Comment Edited] (SPARK-20149) Audit PySpark code base for 2.6 specific work arounds

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034090#comment-16034090 ] Hyukjin Kwon edited comment on SPARK-20149 at 6/2/17 3:48 AM: -- [~holdenk], I

[jira] [Commented] (SPARK-20149) Audit PySpark code base for 2.6 specific work arounds

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034090#comment-16034090 ] Hyukjin Kwon commented on SPARK-20149: -- [~holdenk], I quickly looked at the Python 2.7 changes in

[jira] [Commented] (SPARK-15682) Hive ORC partition write looks for root hdfs folder for existence

2017-06-01 Thread lyc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034076#comment-16034076 ] lyc commented on SPARK-15682: - Hi, I tried this both for `orc` and `parquet`, and they both throw `path

[jira] [Updated] (SPARK-20950) Improve Serializerbuffersize configurable

2017-06-01 Thread caoxuewen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caoxuewen updated SPARK-20950: -- Component/s: (was: SQL) > Improve Serializerbuffersize configurable >

[jira] [Assigned] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20959: Assignee: (was: Apache Spark) > Add a parameter to UnsafeExternalSorter to configure

[jira] [Assigned] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20959: Assignee: Apache Spark > Add a parameter to UnsafeExternalSorter to configure

[jira] [Commented] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034060#comment-16034060 ] Apache Spark commented on SPARK-20959: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Created] (SPARK-20959) Add a parameter to UnsafeExternalSorter to configure filebuffersize

2017-06-01 Thread caoxuewen (JIRA)
caoxuewen created SPARK-20959: - Summary: Add a parameter to UnsafeExternalSorter to configure filebuffersize Key: SPARK-20959 URL: https://issues.apache.org/jira/browse/SPARK-20959 Project: Spark

[jira] [Updated] (SPARK-20950) Improve Serializerbuffersize configurable

2017-06-01 Thread caoxuewen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caoxuewen updated SPARK-20950: -- Description: 1.With spark.shuffle.sort.initialSerBufferSize configure SerializerBufferSize of

[jira] [Updated] (SPARK-20950) Improve Serializerbuffersize configurable

2017-06-01 Thread caoxuewen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caoxuewen updated SPARK-20950: -- Summary: Improve Serializerbuffersize configurable (was: Improve Serializerbuffersize and

[jira] [Commented] (SPARK-20950) Improve Serializerbuffersize and filebuffersize configurable

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034053#comment-16034053 ] Apache Spark commented on SPARK-20950: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Commented] (SPARK-20935) A daemon thread, "BatchedWriteAheadLog Writer", left behind after terminating StreamingContext.

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034036#comment-16034036 ] Hyukjin Kwon commented on SPARK-20935: -- Thanks for pinging me. Could we just always stop()

[jira] [Commented] (SPARK-20935) A daemon thread, "BatchedWriteAheadLog Writer", left behind after terminating StreamingContext.

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034035#comment-16034035 ] Hyukjin Kwon commented on SPARK-20935: -- Thanks for pinging me. Could we just always stop()

[jira] [Issue Comment Deleted] (SPARK-20935) A daemon thread, "BatchedWriteAheadLog Writer", left behind after terminating StreamingContext.

2017-06-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-20935: - Comment: was deleted (was: Thanks for pinging me. Could we just always stop()

[jira] [Comment Edited] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-01 Thread CanBin Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033988#comment-16033988 ] CanBin Zheng edited comment on SPARK-20943 at 6/2/17 1:11 AM: -- Look at these

[jira] [Comment Edited] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-01 Thread CanBin Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033988#comment-16033988 ] CanBin Zheng edited comment on SPARK-20943 at 6/2/17 1:09 AM: -- Look at these

[jira] [Commented] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment

2017-06-01 Thread CanBin Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033988#comment-16033988 ] CanBin Zheng commented on SPARK-20943: -- Look at these two cases. //Has Aggregator defined @Test

[jira] [Assigned] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20958: Assignee: Apache Spark (was: Cheng Lian) > Roll back parquet-mr 1.8.2 to parquet-1.8.1 >

[jira] [Assigned] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20958: Assignee: Cheng Lian (was: Apache Spark) > Roll back parquet-mr 1.8.2 to parquet-1.8.1 >

[jira] [Commented] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033987#comment-16033987 ] Apache Spark commented on SPARK-20958: -- User 'liancheng' has created a pull request for this issue:

[jira] [Created] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-01 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-20958: -- Summary: Roll back parquet-mr 1.8.2 to parquet-1.8.1 Key: SPARK-20958 URL: https://issues.apache.org/jira/browse/SPARK-20958 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-20957) Flaky Test: o.a.s.sql.streaming.StreamingQueryManagerSuite listing

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20957: Assignee: Apache Spark (was: Shixiong Zhu) > Flaky Test:

[jira] [Assigned] (SPARK-20957) Flaky Test: o.a.s.sql.streaming.StreamingQueryManagerSuite listing

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20957: Assignee: Shixiong Zhu (was: Apache Spark) > Flaky Test:

[jira] [Commented] (SPARK-20957) Flaky Test: o.a.s.sql.streaming.StreamingQueryManagerSuite listing

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033955#comment-16033955 ] Apache Spark commented on SPARK-20957: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-20957) Flaky Test: o.a.s.sql.streaming.StreamingQueryManagerSuite listing

2017-06-01 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20957: Summary: Flaky Test: o.a.s.sql.streaming.StreamingQueryManagerSuite listing Key: SPARK-20957 URL: https://issues.apache.org/jira/browse/SPARK-20957 Project: Spark

[jira] [Resolved] (SPARK-19150) completely support using hive as data source to create tables

2017-06-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19150. - Resolution: Fixed Fix Version/s: 2.2.0 Target Version/s: (was: 2.3.0) all

[jira] [Resolved] (SPARK-17203) data source options should always be case insensitive

2017-06-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17203. - Resolution: Fixed Fix Version/s: 2.2.0 Target Version/s: (was: 2.3.0) this is

[jira] [Comment Edited] (SPARK-20520) R streaming tests failed on Windows

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033918#comment-16033918 ] Dongjoon Hyun edited comment on SPARK-20520 at 6/1/17 11:46 PM: Hi,

[jira] [Commented] (SPARK-20520) R streaming tests failed on Windows

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033918#comment-16033918 ] Dongjoon Hyun commented on SPARK-20520: --- Hi, [~felixcheung]. Is this still targeting 2.2.0? > R

[jira] [Comment Edited] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033915#comment-16033915 ] Shixiong Zhu edited comment on SPARK-20952 at 6/1/17 11:43 PM: ---

[jira] [Commented] (SPARK-20952) TaskContext should be an InheritableThreadLocal

2017-06-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033915#comment-16033915 ] Shixiong Zhu commented on SPARK-20952: -- InheritableThreadLocal only works when creating a new

[jira] [Commented] (SPARK-20025) Driver fail over will not work, if SPARK_LOCAL* env is set.

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033912#comment-16033912 ] Dongjoon Hyun commented on SPARK-20025: --- Hi, [~scrapco...@gmail.com]. Could you adjust the target

[jira] [Commented] (SPARK-20129) JavaSparkContext should use SparkContext.getOrCreate

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033911#comment-16033911 ] Dongjoon Hyun commented on SPARK-20129: --- Hi, [~mengxr]. Is this resolved at 2.2.0? >

[jira] [Commented] (SPARK-19035) rand() function in case when cause failed

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033908#comment-16033908 ] Dongjoon Hyun commented on SPARK-19035: --- If this is not resolved at 2.2.0, shall we remove the

[jira] [Commented] (SPARK-18451) Always set -XX:+HeapDumpOnOutOfMemoryError for Spark tests

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033904#comment-16033904 ] Dongjoon Hyun commented on SPARK-18451: --- Hi, [~lian cheng]. Shall we remove the target version here

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033900#comment-16033900 ] Dongjoon Hyun commented on SPARK-17637: --- Shall we remove the target version 2.2.0 here? > Packed

[jira] [Assigned] (SPARK-20894) Error while checkpointing to HDFS

2017-06-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-20894: Assignee: Shixiong Zhu > Error while checkpointing to HDFS >

[jira] [Updated] (SPARK-20894) Error while checkpointing to HDFS

2017-06-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-20894: - Fix Version/s: 2.3.0 > Error while checkpointing to HDFS > - > >

[jira] [Commented] (SPARK-15352) Topology aware block replication

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033899#comment-16033899 ] Dongjoon Hyun commented on SPARK-15352: --- Hi, [~shubhamc]. Can we resolve this issue at 2.2.0 right

[jira] [Commented] (SPARK-20894) Error while checkpointing to HDFS

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033898#comment-16033898 ] Apache Spark commented on SPARK-20894: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-15044: -- Target Version/s: (was: 2.2.0) > spark-sql will throw "input path does not exist" exception

[jira] [Commented] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033895#comment-16033895 ] Dongjoon Hyun commented on SPARK-15044: --- Hi, All. According to SPARK-10198, that option seems to be

[jira] [Commented] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-01 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033891#comment-16033891 ] Liang-Chi Hsieh commented on SPARK-20953: - [~rxin] Yeah, thanks for pinging me. I'll look into

[jira] [Commented] (SPARK-12661) Drop Python 2.6 support in PySpark

2017-06-01 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033886#comment-16033886 ] Dongjoon Hyun commented on SPARK-12661: --- Hi, All. Is it enough to resolve this issue? Or, do we

[jira] [Commented] (SPARK-20922) Unsafe deserialization in Spark LauncherConnection

2017-06-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033877#comment-16033877 ] Apache Spark commented on SPARK-20922: -- User 'vanzin' has created a pull request for this issue:

[jira] [Updated] (SPARK-15693) Write schema definition out for file-based data sources to avoid schema inference

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15693: - Target Version/s: 2.3.0 (was: 2.2.0) > Write schema definition out for file-based data

[jira] [Updated] (SPARK-15380) Generate code that stores a float/double value in each column from ColumnarBatch when DataFrame.cache() is used

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15380: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate code that stores a float/double value

[jira] [Updated] (SPARK-19084) conditional function: field

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19084: - Target Version/s: 2.3.0 (was: 2.2.0) > conditional function: field >

[jira] [Updated] (SPARK-15691) Refactor and improve Hive support

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15691: - Target Version/s: 2.3.0 (was: 2.2.0) > Refactor and improve Hive support >

[jira] [Updated] (SPARK-14878) Support Trim characters in the string trim function

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14878: - Target Version/s: 2.3.0 (was: 2.2.0) > Support Trim characters in the string trim

[jira] [Updated] (SPARK-16496) Add wholetext as option for reading text in SQL.

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16496: - Target Version/s: 2.3.0 (was: 2.2.0) > Add wholetext as option for reading text in SQL.

[jira] [Updated] (SPARK-19241) remove hive generated table properties if they are not useful in Spark

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19241: - Target Version/s: 2.3.0 (was: 2.2.0) > remove hive generated table properties if they

[jira] [Updated] (SPARK-16317) Add file filtering interface for FileFormat

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16317: - Target Version/s: 2.3.0 (was: 2.2.0) > Add file filtering interface for FileFormat >

[jira] [Updated] (SPARK-19027) estimate size of object buffer for object hash aggregate

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19027: - Target Version/s: 2.3.0 (was: 2.2.0) > estimate size of object buffer for object hash

[jira] [Updated] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19104: - Target Version/s: 2.3.0 (was: 2.2.0) > CompileException with Map and Case Class in

[jira] [Updated] (SPARK-18245) Improving support for bucketed table

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18245: - Target Version/s: 2.3.0 (was: 2.2.0) > Improving support for bucketed table >

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14098: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate Java code to build CachedColumnarBatch

[jira] [Updated] (SPARK-19014) support complex aggregate buffer in HashAggregateExec

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19014: - Target Version/s: 2.3.0 (was: 2.2.0) > support complex aggregate buffer in

[jira] [Updated] (SPARK-16011) SQL metrics include duplicated attempts

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16011: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL metrics include duplicated attempts >

[jira] [Updated] (SPARK-18388) Running aggregation on many columns throws SOE

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18388: - Target Version/s: 2.3.0 (was: 2.2.0) > Running aggregation on many columns throws SOE >

[jira] [Updated] (SPARK-19989) Flaky Test: org.apache.spark.sql.kafka010.KafkaSourceStressSuite

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19989: - Target Version/s: 2.3.0 (was: 2.2.0) > Flaky Test:

[jira] [Updated] (SPARK-17915) Prepare ColumnVector implementation for UnsafeData

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17915: - Target Version/s: 2.3.0 (was: 2.2.0) > Prepare ColumnVector implementation for

[jira] [Updated] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18134: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL: MapType in Group BY and Joins not working >

[jira] [Updated] (SPARK-18455) General support for correlated subquery processing

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18455: - Target Version/s: 2.3.0 (was: 2.2.0) > General support for correlated subquery

[jira] [Updated] (SPARK-15690) Fast single-node (single-process) in-memory shuffle

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15690: - Target Version/s: 2.3.0 (was: 2.2.0) > Fast single-node (single-process) in-memory

[jira] [Updated] (SPARK-15689) Data source API v2

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15689: - Target Version/s: 2.3.0 (was: 2.2.0) > Data source API v2 > -- > >

[jira] [Updated] (SPARK-13184) Support minPartitions parameter for JSON and CSV datasources as options

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13184: - Target Version/s: 2.3.0 (was: 2.2.0) > Support minPartitions parameter for JSON and CSV

[jira] [Updated] (SPARK-13682) Finalize the public API for FileFormat

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13682: - Target Version/s: 2.3.0 (was: 2.2.0) > Finalize the public API for FileFormat >

[jira] [Updated] (SPARK-9221) Support IntervalType in Range Frame

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9221: Target Version/s: 2.3.0 (was: 2.2.0) > Support IntervalType in Range Frame >

[jira] [Updated] (SPARK-20319) Already quoted identifiers are getting wrapped with additional quotes

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20319: - Target Version/s: 2.3.0 (was: 2.2.0) > Already quoted identifiers are getting wrapped

[jira] [Updated] (SPARK-9576) DataFrame API improvement umbrella ticket (in Spark 2.x)

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9576: Target Version/s: 2.3.0 (was: 2.2.0) > DataFrame API improvement umbrella ticket (in Spark

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18394: - Target Version/s: 2.3.0 (was: 2.2.0) > Executing the same query twice in a row results

[jira] [Updated] (SPARK-18891) Support for specific collection types

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18891: - Target Version/s: 2.3.0 (was: 2.2.0) > Support for specific collection types >

[jira] [Updated] (SPARK-14543) SQL/Hive insertInto has unexpected results

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14543: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL/Hive insertInto has unexpected results >

[jira] [Updated] (SPARK-17556) Executor side broadcast for broadcast joins

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17556: - Target Version/s: 2.3.0 (was: 2.2.0) > Executor side broadcast for broadcast joins >

[jira] [Updated] (SPARK-15694) Implement ScriptTransformation in sql/core

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15694: - Target Version/s: 2.3.0 (was: 2.2.0) > Implement ScriptTransformation in sql/core >

[jira] [Updated] (SPARK-16026) Cost-based Optimizer framework

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16026: - Target Version/s: 2.3.0 (was: 2.2.0) > Cost-based Optimizer framework >

[jira] [Updated] (SPARK-18543) SaveAsTable(CTAS) using overwrite could change table definition

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18543: - Target Version/s: 2.3.0 (was: 2.2.0) > SaveAsTable(CTAS) using overwrite could change

[jira] [Updated] (SPARK-18950) Report conflicting fields when merging two StructTypes.

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18950: - Target Version/s: 2.3.0 (was: 2.2.0) > Report conflicting fields when merging two

[jira] [Updated] (SPARK-15117) Generate code that get a value in each compressed column from CachedBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15117: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate code that get a value in each

[jira] [Updated] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17626: - Target Version/s: 2.3.0 (was: 2.2.0) > TPC-DS performance improvements using

[jira] [Updated] (SPARK-15867) Use bucket files for TABLESAMPLE BUCKET

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15867: - Target Version/s: 2.3.0 (was: 2.2.0) > Use bucket files for TABLESAMPLE BUCKET >

[jira] [Updated] (SPARK-16275) Implement all the Hive fallback functions

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16275: - Target Version/s: 2.3.0 (was: 2.2.0) > Implement all the Hive fallback functions >

[jira] [Updated] (SPARK-12978) Skip unnecessary final group-by when input data already clustered with group-by keys

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12978: - Target Version/s: 2.3.0 (was: 2.2.0) > Skip unnecessary final group-by when input data

[jira] [Updated] (SPARK-16412) Generate Java code that gets an array in each column of CachedBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16412: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate Java code that gets an array in each

[jira] [Updated] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4502: Target Version/s: 2.3.0 (was: 2.2.0) > Spark SQL reads unneccesary nested fields from

[jira] [Updated] (SPARK-16217) Support SELECT INTO statement

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16217: - Target Version/s: 2.3.0 (was: 2.2.0) > Support SELECT INTO statement >

[jira] [Updated] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18084: - Target Version/s: 2.3.0 (was: 2.2.0) > write.partitionBy() does not recognize nested

[jira] [Updated] (SPARK-17924) Consolidate streaming and batch write path

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17924: - Target Version/s: 2.3.0 (was: 2.2.0) > Consolidate streaming and batch write path >

[jira] [Updated] (SPARK-19150) completely support using hive as data source to create tables

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19150: - Target Version/s: 2.3.0 (was: 2.2.0) > completely support using hive as data source to
