[jira] [Created] (SPARK-21234) When the function returns Option[Iterator[_]] is None,then get on None will cause java.util.NoSuchElementException: None.get

2017-06-27 Thread wangjiaochun (JIRA)
wangjiaochun created SPARK-21234: Summary: When the function returns Option[Iterator[_]] is None,then get on None will cause java.util.NoSuchElementException: None.get Key: SPARK-21234 URL:

[jira] [Created] (SPARK-21233) Support pluggable offset storage

2017-06-27 Thread darion yaphet (JIRA)
darion yaphet created SPARK-21233: - Summary: Support pluggable offset storage Key: SPARK-21233 URL: https://issues.apache.org/jira/browse/SPARK-21233 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-21232) New built-in SQL function - Data_Type

2017-06-27 Thread Mario Molina (JIRA)
Mario Molina created SPARK-21232: Summary: New built-in SQL function - Data_Type Key: SPARK-21232 URL: https://issues.apache.org/jira/browse/SPARK-21232 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21232) New built-in SQL function - Data_Type

2017-06-27 Thread Mario Molina (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mario Molina updated SPARK-21232: - Description: This function returns the data type of a given column. {code:java} data_type("a")

[jira] [Commented] (SPARK-21182) Structured streaming on Spark-shell on windows

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065929#comment-16065929 ] Hyukjin Kwon commented on SPARK-21182: -- Ah, could you maybe try to explicitly specify the fully

[jira] [Commented] (SPARK-21182) Structured streaming on Spark-shell on windows

2017-06-27 Thread Vijay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065921#comment-16065921 ] Vijay commented on SPARK-21182: --- I'm still facing the same issue. Actually I have configured Hadoop on

[jira] [Resolved] (SPARK-14486) For partition table, the dag occurs oom because of too many same rdds

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14486. -- Resolution: Invalid I am resolving this. I also agree with the opinion above. > For partition

[jira] [Resolved] (SPARK-21019) read orc when some of the columns are missing in some files

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21019. -- Resolution: Duplicate I believe this should be a duplicate of SPARK-11412. > read orc when

[jira] [Resolved] (SPARK-21053) Number overflow on agg function of Dataframe

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21053. -- Resolution: Cannot Reproduce I tried to follow what's written in this JIRA but I could not

[jira] [Commented] (SPARK-21076) R dapply doesn't return array or raw columns when array have different length

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065902#comment-16065902 ] Hyukjin Kwon commented on SPARK-21076: -- I believe this produces a similar error to the one described above

[jira] [Commented] (SPARK-21182) Structured streaming on Spark-shell on windows

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065876#comment-16065876 ] Hyukjin Kwon commented on SPARK-21182: -- Looks like I can't reproduce this on Windows at the current

[jira] [Assigned] (SPARK-19726) Faild to insert null timestamp value to mysql using spark jdbc

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19726: Assignee: Apache Spark > Faild to insert null timestamp value to mysql using spark jdbc >

[jira] [Commented] (SPARK-19726) Faild to insert null timestamp value to mysql using spark jdbc

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065869#comment-16065869 ] Apache Spark commented on SPARK-19726: -- User 'shuangshuangwang' has created a pull request for this

[jira] [Assigned] (SPARK-19726) Faild to insert null timestamp value to mysql using spark jdbc

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19726: Assignee: (was: Apache Spark) > Faild to insert null timestamp value to mysql using

[jira] [Commented] (SPARK-21222) Move elimination of Distinct clause from analyzer to optimizer

2017-06-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065791#comment-16065791 ] Gengliang Wang commented on SPARK-21222: [~srowen] thanks! I have corrected the statement. >

[jira] [Updated] (SPARK-21222) Move elimination of Distinct clause from analyzer to optimizer

2017-06-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-21222: --- Description: Move elimination of Distinct clause from analyzer to optimizer Distinct clause

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2017-06-27 Thread Han Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065790#comment-16065790 ] Han Xu commented on SPARK-10915: I'm currently traveling without access to my email. To get in touch

[jira] [Commented] (SPARK-16542) bugs about types that result an array of null when creating dataframe using python

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065789#comment-16065789 ] Apache Spark commented on SPARK-16542: -- User 'zasdfgbnm' has created a pull request for this issue:

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2017-06-27 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065787#comment-16065787 ] Erik Erlandson commented on SPARK-10915: This would be great for exposing {{TDigest}} aggregation

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065775#comment-16065775 ] Hyukjin Kwon commented on SPARK-21227: -- In scala too: {code} val jsons = Seq( """{"city_name":

[jira] [Assigned] (SPARK-21155) Add (? running tasks) into Spark UI progress

2017-06-27 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21155: --- Assignee: Eric Vandenberg > Add (? running tasks) into Spark UI progress >

[jira] [Resolved] (SPARK-21155) Add (? running tasks) into Spark UI progress

2017-06-27 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21155. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18369

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065769#comment-16065769 ] Hyukjin Kwon commented on SPARK-21227: -- I can reproduce this in both Python 2.7 and 3.6.0 {code} sc

[jira] [Issue Comment Deleted] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21227: - Comment: was deleted (was: I tested both on Python 3.6.0 and 2.7.10 as below but I

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065768#comment-16065768 ] Hyukjin Kwon commented on SPARK-21227: -- I tested both on Python 3.6.0 and 2.7.10 as below

[jira] [Commented] (SPARK-21152) Use level 3 BLAS operations in LogisticAggregator

2017-06-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065694#comment-16065694 ] yuhao yang commented on SPARK-21152: This is something that we should investigate anyway. By GEMM,

[jira] [Assigned] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21231: Assignee: Apache Spark > Conda install of packages during Jenkins testing is causing

[jira] [Commented] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065621#comment-16065621 ] Apache Spark commented on SPARK-21231: -- User 'holdenk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21231: Assignee: (was: Apache Spark) > Conda install of packages during Jenkins testing is

[jira] [Created] (SPARK-21231) Conda install of packages during Jenkins testing is causing intermittent failure

2017-06-27 Thread holdenk (JIRA)
holdenk created SPARK-21231: --- Summary: Conda install of packages during Jenkins testing is causing intermittent failure Key: SPARK-21231 URL: https://issues.apache.org/jira/browse/SPARK-21231 Project:

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065515#comment-16065515 ] Steve Loughran commented on SPARK-21137: bq. so it is something that could be optimized in the

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065476#comment-16065476 ] Apache Spark commented on SPARK-21137: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065475#comment-16065475 ] Sean Owen commented on SPARK-21137: --- OK, so it is something that could be optimized in the Hadoop API,

[jira] [Assigned] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21137: Assignee: (was: Apache Spark) > Spark reads many small files slowly >

[jira] [Assigned] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21137: Assignee: Apache Spark > Spark reads many small files slowly >

[jira] [Commented] (SPARK-12868) ADD JAR via sparkSQL JDBC will fail when using a HDFS URL

2017-06-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065448#comment-16065448 ] Steve Loughran commented on SPARK-12868: I think this is the case of HADOOP-14598: once the FS

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065445#comment-16065445 ] Steve Loughran commented on SPARK-21137: Filed HADOOP-14600. Looks like a v. old codepath that's

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065428#comment-16065428 ] Steve Loughran commented on SPARK-21137: ps, for now, do it in parallel:

[jira] [Commented] (SPARK-21137) Spark reads many small files slowly

2017-06-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065423#comment-16065423 ] Steve Loughran commented on SPARK-21137: Looking at this. something is trying to get the

[jira] [Resolved] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles resolved SPARK-21218. Resolution: Duplicate > Convert IN predicate to equivalent Parquet filter >

[jira] [Comment Edited] (SPARK-17091) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065413#comment-16065413 ] Michael Styles edited comment on SPARK-17091 at 6/27/17 8:26 PM: - By not

[jira] [Updated] (SPARK-17091) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-17091: --- Summary: Convert IN predicate to equivalent Parquet filter (was: ParquetFilters rewrite IN

[jira] [Updated] (SPARK-17091) ParquetFilters rewrite IN to OR of Eq

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles updated SPARK-17091: --- Attachment: IN Predicate.png OR Predicate.png > ParquetFilters rewrite IN to

[jira] [Commented] (SPARK-17091) ParquetFilters rewrite IN to OR of Eq

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065413#comment-16065413 ] Michael Styles commented on SPARK-17091: By not pushing the filter to Parquet, are we not

[jira] [Reopened] (SPARK-17091) ParquetFilters rewrite IN to OR of Eq

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Styles reopened SPARK-17091: > ParquetFilters rewrite IN to OR of Eq > - > >

[jira] [Comment Edited] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065351#comment-16065351 ] Michael Kunkel edited comment on SPARK-21215 at 6/27/17 7:37 PM: -

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065348#comment-16065348 ] Sean Owen commented on SPARK-21215: --- Not sure what you're looking at, but the mailing list has posts

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065351#comment-16065351 ] Michael Kunkel commented on SPARK-21215: [~sowen] I am not attempting to argue the facts. When I

[jira] [Comment Edited] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065354#comment-16065354 ] Michael Kunkel edited comment on SPARK-21215 at 6/27/17 7:40 PM: - The

[jira] [Commented] (SPARK-21230) Spark Encoder with mysql Enum and data truncated Error

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065347#comment-16065347 ] Michael Kunkel commented on SPARK-21230: The problem is with the Spark Encoder of type enum. So

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065354#comment-16065354 ] Michael Kunkel commented on SPARK-21215: The posts go onto the list, but the owner ASF does not

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065345#comment-16065345 ] Michael Kunkel commented on SPARK-21215: I looked at a few months worth of posts, and it seems

[jira] [Commented] (SPARK-21230) Spark Encoder with mysql Enum and data truncated Error

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065338#comment-16065338 ] Sean Owen commented on SPARK-21230: --- This also does not look like a useful JIRA. It looks like a

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065336#comment-16065336 ] Sean Owen commented on SPARK-21215: --- I'm not sure what you're referring to. The user@ list works fine.

[jira] [Created] (SPARK-21230) Spark Encoder with mysql Enum and data truncated Error

2017-06-27 Thread Michael Kunkel (JIRA)
Michael Kunkel created SPARK-21230: -- Summary: Spark Encoder with mysql Enum and data truncated Error Key: SPARK-21230 URL: https://issues.apache.org/jira/browse/SPARK-21230 Project: Spark

[jira] [Commented] (SPARK-21215) Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve

2017-06-27 Thread Michael Kunkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065329#comment-16065329 ] Michael Kunkel commented on SPARK-21215: The "resolution for this by [~sowen] was to put this on

[jira] [Commented] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065210#comment-16065210 ] Michael Styles commented on SPARK-21218: In Parquet 1.7, there was a bug involving corrupt

[jira] [Comment Edited] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Andrew Duffy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065177#comment-16065177 ] Andrew Duffy edited comment on SPARK-21218 at 6/27/17 5:39 PM: --- Curious, I

[jira] [Commented] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Andrew Duffy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065177#comment-16065177 ] Andrew Duffy commented on SPARK-21218: -- Curious, I wonder what the previous benchmarks were lacking.

[jira] [Commented] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065142#comment-16065142 ] Michael Styles commented on SPARK-21218: [~hyukjin.kwon] Not sure I understand what you want me

[jira] [Resolved] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-27 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19104. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18418

[jira] [Assigned] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-27 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19104: --- Assignee: Liang-Chi Hsieh > CompileException with Map and Case Class in Spark 2.1.0 >

[jira] [Assigned] (SPARK-21229) remove QueryPlan.preCanonicalized

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21229: Assignee: Wenchen Fan (was: Apache Spark) > remove QueryPlan.preCanonicalized >

[jira] [Commented] (SPARK-21229) remove QueryPlan.preCanonicalized

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065118#comment-16065118 ] Apache Spark commented on SPARK-21229: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21229) remove QueryPlan.preCanonicalized

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21229: Assignee: Apache Spark (was: Wenchen Fan) > remove QueryPlan.preCanonicalized >

[jira] [Created] (SPARK-21229) remove QueryPlan.preCanonicalized

2017-06-27 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-21229: --- Summary: remove QueryPlan.preCanonicalized Key: SPARK-21229 URL: https://issues.apache.org/jira/browse/SPARK-21229 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065023#comment-16065023 ] Apache Spark commented on SPARK-18294: -- User 'jiangxb1987' has created a pull request for this

[jira] [Commented] (SPARK-20226) Call to sqlContext.cacheTable takes an incredibly long time in some cases

2017-06-27 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064972#comment-16064972 ] Barry Becker commented on SPARK-20226: -- Calling cache() on the dataframe after the addColumn

[jira] [Commented] (SPARK-21228) InSet incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064970#comment-16064970 ] Bogdan Raducanu commented on SPARK-21228: - InSubquery.doCodeGen is using InSet directly (although

[jira] [Commented] (SPARK-20002) Add support for unions between streaming and batch datasets

2017-06-27 Thread Leon Pham (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064947#comment-16064947 ] Leon Pham commented on SPARK-20002: --- We're actually reading data from two different sources and one of

[jira] [Commented] (SPARK-21228) InSet incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064945#comment-16064945 ] Bogdan Raducanu commented on SPARK-21228: - I tested manually (since there is no flag to disable

[jira] [Commented] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064880#comment-16064880 ] Hyukjin Kwon commented on SPARK-21218: -- Yea, I support this for what it's worth. Let's resolve this as

[jira] [Updated] (SPARK-21228) InSet incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Updated] (SPARK-21228) InSet incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Summary: InSet incorrect handling of structs (was: InSet.doCodeGen incorrect handling of

[jira] [Updated] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Updated] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Updated] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Updated] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Updated] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bogdan Raducanu updated SPARK-21228: Description: In InSet it's possible that hset contains GenericInternalRows while child

[jira] [Created] (SPARK-21228) InSet.doCodeGen incorrect handling of structs

2017-06-27 Thread Bogdan Raducanu (JIRA)
Bogdan Raducanu created SPARK-21228: --- Summary: InSet.doCodeGen incorrect handling of structs Key: SPARK-21228 URL: https://issues.apache.org/jira/browse/SPARK-21228 Project: Spark Issue

[jira] [Comment Edited] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-06-27 Thread Dominic Ricard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064833#comment-16064833 ] Dominic Ricard edited comment on SPARK-21067 at 6/27/17 1:31 PM: -

[jira] [Commented] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-06-27 Thread Dominic Ricard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064833#comment-16064833 ] Dominic Ricard commented on SPARK-21067: [~q79969786], yes. As stated in the description, ours is

[jira] [Updated] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Seydou Dia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seydou Dia updated SPARK-21227: --- Description: Hi, please find below the steps to reproduce the issue I am facing. First I create a

[jira] [Created] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Seydou Dia (JIRA)
Seydou Dia created SPARK-21227: -- Summary: Unicode in Json field causes AnalysisException when selecting from Dataframe Key: SPARK-21227 URL: https://issues.apache.org/jira/browse/SPARK-21227 Project:

[jira] [Updated] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-27 Thread Seydou Dia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seydou Dia updated SPARK-21227: --- Description: Hi, please find below the steps to reproduce the issue I am facing, {code:python} $

[jira] [Assigned] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21176: Assignee: Apache Spark > Master UI hangs with spark.ui.reverseProxy=true if the master

[jira] [Commented] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064747#comment-16064747 ] Apache Spark commented on SPARK-21176: -- User 'IngoSchuster' has created a pull request for this

[jira] [Assigned] (SPARK-21176) Master UI hangs with spark.ui.reverseProxy=true if the master node has many CPUs

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21176: Assignee: (was: Apache Spark) > Master UI hangs with spark.ui.reverseProxy=true if

[jira] [Comment Edited] (SPARK-21218) Convert IN predicate to equivalent Parquet filter

2017-06-27 Thread Michael Styles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064608#comment-16064608 ] Michael Styles edited comment on SPARK-21218 at 6/27/17 12:17 PM: -- By

[jira] [Assigned] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20073: Assignee: Apache Spark > Unexpected Cartesian product when using eqNullSafe in join with

[jira] [Assigned] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20073: Assignee: (was: Apache Spark) > Unexpected Cartesian product when using eqNullSafe in

[jira] [Commented] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064729#comment-16064729 ] Apache Spark commented on SPARK-20073: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-21226) Save empty dataframe in pyspark prints nothing

2017-06-27 Thread Carlos M. Casas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064706#comment-16064706 ] Carlos M. Casas commented on SPARK-21226: - The error is a different way of writing what

[jira] [Commented] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064682#comment-16064682 ] Apache Spark commented on SPARK-21225: -- User 'JackYangzg' has created a pull request for this issue:

[jira] [Commented] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064676#comment-16064676 ] Sean Owen commented on SPARK-21223: --- [~gostop_zlx] this overlaps a lot with SPARK-21078. Can you look

[jira] [Updated] (SPARK-21226) Save empty dataframe in pyspark prints nothing

2017-06-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21226: -- Priority: Minor (was: Major) What is the error? > Save empty dataframe in pyspark prints nothing >

[jira] [Assigned] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21225: Assignee: (was: Apache Spark) > decrease the Mem using for variable 'tasks' in

[jira] [Assigned] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21225: Assignee: Apache Spark > decrease the Mem using for variable 'tasks' in function

[jira] [Commented] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064658#comment-16064658 ] Apache Spark commented on SPARK-21225: -- User 'JackYangzg' has created a pull request for this issue:

[jira] [Updated] (SPARK-21225) decrease the Mem using for variable 'tasks' in function resourceOffers

2017-06-27 Thread yangZhiguo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangZhiguo updated SPARK-21225: --- Description: In the function 'resourceOffers', It declare a variable 'tasks' for storage the
