[jira] [Commented] (SPARK-16188) Spark sql create a lot of small files

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350015#comment-15350015 ] Sean Owen commented on SPARK-16188: --- Generally the idea is to merge to fewer partitions

[jira] [Commented] (SPARK-16188) Spark sql create a lot of small files

2016-06-25 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350013#comment-15350013 ] cen yuhai commented on SPARK-16188: --- [~hyukjin.kwon] It is not just empty files, but a

[jira] [Updated] (SPARK-16214) make SparkPi iteration number correct

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16214: -- Assignee: Yang Hao Priority: Minor (was: Major) Description: As the denominator is n, th

[jira] [Reopened] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-16214: --- Fixing up this JIRA. It should not be resolved, etc. > make SparkPi iteration time correct > ---

[jira] [Commented] (SPARK-16211) DataFrame filter is buggy when used with "and"

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350010#comment-15350010 ] Herman van Hovell commented on SPARK-16211: --- It would also be great if we could

[jira] [Commented] (SPARK-16217) Support SELECT INTO statement

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350008#comment-15350008 ] Herman van Hovell commented on SPARK-16217: --- This seems like a sensible thing t

[jira] [Commented] (SPARK-16211) DataFrame filter is buggy when used with "and"

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350007#comment-15350007 ] Sean Owen commented on SPARK-16211: --- Can you try with some more recent version? I don't

[jira] [Updated] (SPARK-16217) Support SELECT INTO statement

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-16217: -- Target Version/s: 2.1.0 > Support SELECT INTO statement > -

[jira] [Updated] (SPARK-16217) Support SELECT INTO statement

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-16217: -- Affects Version/s: (was: 2.0.0) 2.1.0 > Support SELECT INTO

[jira] [Commented] (SPARK-16070) DataFrame/Parquet issues with primitive arrays

2016-06-25 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350005#comment-15350005 ] Kazuaki Ishizaki commented on SPARK-16070: -- I added two JIRA entries, which addr

[jira] [Updated] (SPARK-16211) DataFrame filter is buggy when used with "and"

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renat Bekbolatov updated SPARK-16211: - Summary: DataFrame filter is buggy when used with "and" (was: DataFrame filter is buggy

[jira] [Updated] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renat Bekbolatov updated SPARK-16211: - Description: df was a result of several joins with some upstream tables having column nam

[jira] [Issue Comment Deleted] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renat Bekbolatov updated SPARK-16211: - Comment: was deleted (was: I haven't tested this on a later version. ) > DataFrame filte

[jira] [Commented] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349992#comment-15349992 ] Apache Spark commented on SPARK-16216: -- User 'HyukjinKwon' has created a pull reques

[jira] [Assigned] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16216: Assignee: (was: Apache Spark) > CSV data source does not write date and timestamp corr

[jira] [Assigned] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16216: Assignee: Apache Spark > CSV data source does not write date and timestamp correctly > ---

[jira] [Created] (SPARK-16217) Support SELECT INTO statement

2016-06-25 Thread GuangFancui(ISCAS) (JIRA)
GuangFancui(ISCAS) created SPARK-16217: -- Summary: Support SELECT INTO statement Key: SPARK-16217 URL: https://issues.apache.org/jira/browse/SPARK-16217 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-16215) Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349984#comment-15349984 ] Apache Spark commented on SPARK-16215: -- User 'kiszk' has created a pull request for

[jira] [Assigned] (SPARK-16215) Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16215: Assignee: (was: Apache Spark) > Reduce runtime overhead of a program that writes an pr

[jira] [Assigned] (SPARK-16215) Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16215: Assignee: Apache Spark > Reduce runtime overhead of a program that writes an primitive arr

[jira] [Created] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-06-25 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16216: Summary: CSV data source does not write date and timestamp correctly Key: SPARK-16216 URL: https://issues.apache.org/jira/browse/SPARK-16216 Project: Spark

[jira] [Commented] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349983#comment-15349983 ] Dongjoon Hyun commented on SPARK-16208: --- New PR is in the process of review. > Add

[jira] [Created] (SPARK-16215) Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset

2016-06-25 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-16215: Summary: Reduce runtime overhead of a program that writes an primitive array in Dataframe/Dataset Key: SPARK-16215 URL: https://issues.apache.org/jira/browse/SPARK-16215

[jira] [Reopened] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-16208: --- I made a new PR. > Add `CollapseEmptyPlan` optimizer > - > >

[jira] [Updated] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Attachment: (was: SPARK-16214.patch) > make SparkPi iteration time correct >

[jira] [Updated] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Description: As the denominator is n, the iteration time should also be n was: As the denominator is

[jira] [Updated] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Attachment: SPARK-16214.patch > make SparkPi iteration time correct > ---

[jira] [Resolved] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao resolved SPARK-16214. -- Resolution: Resolved > make SparkPi iteration time correct > --- > >

[jira] [Updated] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Attachment: SPARK-16214.patch > make SparkPi iteration time correct > ---

[jira] [Updated] (SPARK-16214) make SparkPi iteration time correct

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Summary: make SparkPi iteration time correct (was: optimize SparkPi) > make SparkPi iteration time corre

[jira] [Updated] (SPARK-16214) optimize SparkPi

2016-06-25 Thread Yang Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Hao updated SPARK-16214: - Summary: optimize SparkPi (was: calculate pi is not correct) > optimize SparkPi > > >

[jira] [Created] (SPARK-16214) calculate pi is not correct

2016-06-25 Thread Yang Hao (JIRA)
Yang Hao created SPARK-16214: Summary: calculate pi is not correct Key: SPARK-16214 URL: https://issues.apache.org/jira/browse/SPARK-16214 Project: Spark Issue Type: Improvement Compone

[jira] [Commented] (SPARK-16213) Reduce runtime overhead of a program that creates an primitive array in DataFrame

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349940#comment-15349940 ] Apache Spark commented on SPARK-16213: -- User 'kiszk' has created a pull request for

[jira] [Assigned] (SPARK-16213) Reduce runtime overhead of a program that creates an primitive array in DataFrame

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16213: Assignee: Apache Spark > Reduce runtime overhead of a program that creates an primitive ar

[jira] [Assigned] (SPARK-16213) Reduce runtime overhead of a program that creates an primitive array in DataFrame

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16213: Assignee: (was: Apache Spark) > Reduce runtime overhead of a program that creates an p

[jira] [Created] (SPARK-16213) Reduce runtime overhead of a program that creates an primitive array in DataFrame

2016-06-25 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-16213: Summary: Reduce runtime overhead of a program that creates an primitive array in DataFrame Key: SPARK-16213 URL: https://issues.apache.org/jira/browse/SPARK-16213

[jira] [Commented] (SPARK-13767) py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server

2016-06-25 Thread Venkata Satish Kumar Reddy Pichipipati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349938#comment-15349938 ] Venkata Satish Kumar Reddy Pichipipati commented on SPARK-13767: --

[jira] [Assigned] (SPARK-16212) code cleanup of kafka-0-8 to match review feedback on 0-10

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16212: Assignee: Apache Spark > code cleanup of kafka-0-8 to match review feedback on 0-10 >

[jira] [Commented] (SPARK-16212) code cleanup of kafka-0-8 to match review feedback on 0-10

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349937#comment-15349937 ] Apache Spark commented on SPARK-16212: -- User 'koeninger' has created a pull request

[jira] [Assigned] (SPARK-16212) code cleanup of kafka-0-8 to match review feedback on 0-10

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16212: Assignee: (was: Apache Spark) > code cleanup of kafka-0-8 to match review feedback on

[jira] [Created] (SPARK-16212) code cleanup of kafka-0-8 to match review feedback on 0-10

2016-06-25 Thread Cody Koeninger (JIRA)
Cody Koeninger created SPARK-16212: -- Summary: code cleanup of kafka-0-8 to match review feedback on 0-10 Key: SPARK-16212 URL: https://issues.apache.org/jira/browse/SPARK-16212 Project: Spark

[jira] [Comment Edited] (SPARK-16148) TaskLocation does not allow for Executor ID's with underscores

2016-06-25 Thread Tom Magrino (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349920#comment-15349920 ] Tom Magrino edited comment on SPARK-16148 at 6/26/16 1:41 AM: -

[jira] [Commented] (SPARK-16148) TaskLocation does not allow for Executor ID's with underscores

2016-06-25 Thread Tom Magrino (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349920#comment-15349920 ] Tom Magrino commented on SPARK-16148: - For posterity, I'm adding Tejas Patil's respon

[jira] [Updated] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renat Bekbolatov updated SPARK-16211: - Environment: CDH 5.5.0/YARN > DataFrame filter is buggy when possibly: AND clause, one of

[jira] [Updated] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renat Bekbolatov updated SPARK-16211: - Component/s: SQL Spark Shell > DataFrame filter is buggy when possibly:

[jira] [Commented] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349908#comment-15349908 ] Renat Bekbolatov commented on SPARK-16211: -- I haven't tested this on a later ver

[jira] [Created] (SPARK-16211) DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String

2016-06-25 Thread Renat Bekbolatov (JIRA)
Renat Bekbolatov created SPARK-16211: Summary: DataFrame filter is buggy when possibly: AND clause, one of the columns involved is of type String Key: SPARK-16211 URL: https://issues.apache.org/jira/browse/SPA

[jira] [Created] (SPARK-16210) DataFrame.drop(colName) fails if another column has a period in its name

2016-06-25 Thread Simeon Simeonov (JIRA)
Simeon Simeonov created SPARK-16210: --- Summary: DataFrame.drop(colName) fails if another column has a period in its name Key: SPARK-16210 URL: https://issues.apache.org/jira/browse/SPARK-16210 Projec

[jira] [Comment Edited] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349855#comment-15349855 ] MIN-FU YANG edited comment on SPARK-15516 at 6/25/16 11:23 PM:

[jira] [Assigned] (SPARK-16209) Convert Hive Tables to Data Source Tables for CREATE TABLE AS SELECT

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16209: Assignee: (was: Apache Spark) > Convert Hive Tables to Data Source Tables for CREATE T

[jira] [Commented] (SPARK-16209) Convert Hive Tables to Data Source Tables for CREATE TABLE AS SELECT

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349859#comment-15349859 ] Apache Spark commented on SPARK-16209: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-16209) Convert Hive Tables to Data Source Tables for CREATE TABLE AS SELECT

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16209: Assignee: Apache Spark > Convert Hive Tables to Data Source Tables for CREATE TABLE AS SEL

[jira] [Updated] (SPARK-16185) Unresolved Operator When Creating Table As Select Without Enabling Hive Support

2016-06-25 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16185: Description: When we do not turn on the Hive Support, the following query generates a confusing error mess

[jira] [Comment Edited] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349855#comment-15349855 ] MIN-FU YANG edited comment on SPARK-15516 at 6/25/16 11:07 PM:

[jira] [Created] (SPARK-16209) Convert Hive Tables to Data Source Tables for CREATE TABLE AS SELECT

2016-06-25 Thread Xiao Li (JIRA)
Xiao Li created SPARK-16209: --- Summary: Convert Hive Tables to Data Source Tables for CREATE TABLE AS SELECT Key: SPARK-16209 URL: https://issues.apache.org/jira/browse/SPARK-16209 Project: Spark I

[jira] [Commented] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349855#comment-15349855 ] MIN-FU YANG commented on SPARK-15516: - [~holdenk], I found a test case in DataTypeSui

[jira] [Updated] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-16208: -- Description: This PR adds a new logical optimizer, `CollapseEmptyPlan`, to collapse a logical

[jira] [Commented] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349827#comment-15349827 ] Apache Spark commented on SPARK-16208: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Commented] (SPARK-16183) Large Spark SQL commands cause StackOverflowError in parser when using sqlContext.sql

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349758#comment-15349758 ] Dongjoon Hyun commented on SPARK-16183: --- Oh, I see what you mean. Thank you. > Lar

[jira] [Commented] (SPARK-16183) Large Spark SQL commands cause StackOverflowError in parser when using sqlContext.sql

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349746#comment-15349746 ] Herman van Hovell commented on SPARK-16183: --- [~dongjoon] We don't have a parser

[jira] [Commented] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349745#comment-15349745 ] Dongjoon Hyun commented on SPARK-16208: --- I will revisit this issue later. > Add `C

[jira] [Closed] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-16208. - Resolution: Incomplete > Add `CollapseEmptyPlan` optimizer > - >

[jira] [Commented] (SPARK-16203) regexp_extract to return an ArrayType(StringType())

2016-06-25 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349737#comment-15349737 ] Herman van Hovell commented on SPARK-16203: --- I do agree that this is not effici

[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-25 Thread Alex Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349714#comment-15349714 ] Alex Jiang commented on SPARK-13288: Agree. For DirectStream, we don't need to create

[jira] [Resolved] (SPARK-16193) Address flaky ExternalAppendOnlyMapSuite spilling tests

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16193. --- Resolution: Fixed Fix Version/s: 2.0.1 1.6.3 Issue resolved by pull request

[jira] [Commented] (SPARK-16207) order guarantees for DataFrames

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349594#comment-15349594 ] Max Moroz commented on SPARK-16207: --- Would something like this be useful in the docs? I

[jira] [Commented] (SPARK-16203) regexp_extract to return an ArrayType(StringType())

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349582#comment-15349582 ] Max Moroz commented on SPARK-16203: --- Hive SQL syntax allows the return value from a fun

[jira] [Commented] (SPARK-16188) Spark sql create a lot of small files

2016-06-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349581#comment-15349581 ] Hyukjin Kwon commented on SPARK-16188: -- Is this a duplicate of SPARK-10216 maybe?

[jira] [Comment Edited] (SPARK-16188) Spark sql create a lot of small files

2016-06-25 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349581#comment-15349581 ] Hyukjin Kwon edited comment on SPARK-16188 at 6/25/16 10:04 AM: ---

[jira] [Assigned] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16208: Assignee: (was: Apache Spark) > Add `CollapseEmptyPlan` optimizer > --

[jira] [Assigned] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16208: Assignee: Apache Spark > Add `CollapseEmptyPlan` optimizer > -

[jira] [Assigned] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16208: Assignee: Apache Spark > Add `CollapseEmptyPlan` optimizer > -

[jira] [Commented] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349526#comment-15349526 ] Apache Spark commented on SPARK-16208: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Commented] (SPARK-15861) pyspark mapPartitions with none generator functions / functors

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349524#comment-15349524 ] Sean Owen commented on SPARK-15861: --- Does this resolve the problem / answer the questio

[jira] [Created] (SPARK-16208) Add `CollapseEmptyPlan` optimizer

2016-06-25 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-16208: - Summary: Add `CollapseEmptyPlan` optimizer Key: SPARK-16208 URL: https://issues.apache.org/jira/browse/SPARK-16208 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349520#comment-15349520 ] MIN-FU YANG edited comment on SPARK-15516 at 6/25/16 9:11 AM: -

[jira] [Commented] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349520#comment-15349520 ] MIN-FU YANG commented on SPARK-15516: - [~holdenk] I am wondering the banning on schem

[jira] [Updated] (SPARK-16205) dict -> StructType conversion is undocumented

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Moroz updated SPARK-16205: -- Description: According to the docs, StructType is equivalent only to python list and tuple. I accident

[jira] [Commented] (SPARK-16203) regexp_extract to return an ArrayType(StringType())

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349510#comment-15349510 ] Sean Owen commented on SPARK-16203: --- I'm pretty certain this is for consistency with Hi

[jira] [Updated] (SPARK-16205) dict -> StructType conversion is undocumented

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Moroz updated SPARK-16205: -- Description: According to the docs, StructType is equivalent only to python list and tuple. I accident

[jira] [Comment Edited] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347465#comment-15347465 ] MIN-FU YANG edited comment on SPARK-15516 at 6/25/16 8:35 AM: -

[jira] [Updated] (SPARK-16207) order guarantees for DataFrames

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16207: -- Priority: Minor (was: Major) Generally, things like RDD and DataFrame don't guarantee any order at all

[jira] [Comment Edited] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-25 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347465#comment-15347465 ] MIN-FU YANG edited comment on SPARK-15516 at 6/25/16 8:35 AM: -

[jira] [Updated] (SPARK-16204) Row() interface

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Moroz updated SPARK-16204: -- Summary: Row() interface (was: Change Row() interface) > Row() interface > --- > >

[jira] [Updated] (SPARK-16204) Change Row() interface

2016-06-25 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Moroz updated SPARK-16204: -- Summary: Change Row() interface (was: Row() interfact) > Change Row() interface >

[jira] [Updated] (SPARK-1301) Add UI elements to collapse "Aggregated Metrics by Executor" pane on stage page

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1301: - Assignee: Alex Bozarth > Add UI elements to collapse "Aggregated Metrics by Executor" pane on stage > pag

[jira] [Created] (SPARK-16207) order guarantees for DataFrames

2016-06-25 Thread Max Moroz (JIRA)
Max Moroz created SPARK-16207: - Summary: order guarantees for DataFrames Key: SPARK-16207 URL: https://issues.apache.org/jira/browse/SPARK-16207 Project: Spark Issue Type: Documentation

[jira] [Resolved] (SPARK-1301) Add UI elements to collapse "Aggregated Metrics by Executor" pane on stage page

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1301. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13037 [https://github.com/a

[jira] [Updated] (SPARK-15958) Make initial buffer size for the Sorter configurable

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15958: -- Assignee: Sital Kedia > Make initial buffer size for the Sorter configurable >

[jira] [Resolved] (SPARK-15958) Make initial buffer size for the Sorter configurable

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15958. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13699 [https://github.co

[jira] [Resolved] (SPARK-16206) Defining our own folds using CrossValidator

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16206. --- Resolution: Not A Problem If this is more of a discussion, ask at user@. You can implement whatever y

[jira] [Updated] (SPARK-16198) Change the access level of the predict method in spark.ml.Predictor to public

2016-06-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16198: -- Priority: Minor (was: Major) > Change the access level of the predict method in spark.ml.Predictor to

[jira] [Created] (SPARK-16206) Defining our own folds using CrossValidator

2016-06-25 Thread Danilo Bustos Pérez (JIRA)
Danilo Bustos Pérez created SPARK-16206: --- Summary: Defining our own folds using CrossValidator Key: SPARK-16206 URL: https://issues.apache.org/jira/browse/SPARK-16206 Project: Spark Iss