[jira] [Commented] (SPARK-16633) lag/lead does not return the default value when the offset row does not exist

2016-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385411#comment-15385411 ] Yin Huai commented on SPARK-16633: -- Seems this bug only affects cases that use a constant as the column

[jira] [Updated] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-16632: --- Assignee: Marcelo Vanzin > Vectorized parquet reader fails to read certain fields from Hive tables >

[jira] [Commented] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385399#comment-15385399 ] Cheng Lian commented on SPARK-16632: [~vanzin] Did you post the wrong stack trace? This issue is

[jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385394#comment-15385394 ] Dongjoon Hyun commented on SPARK-16464: --- Oh, I totally forgot that I had mentioned the same

[jira] [Commented] (SPARK-16641) Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead)

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385393#comment-15385393 ] Reynold Xin commented on SPARK-16641: - I thought about a boolean flag, but in general boolean flags

[jira] [Commented] (SPARK-16641) Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead)

2016-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385391#comment-15385391 ] Wenchen Fan commented on SPARK-16641: - a workaround is

[jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385389#comment-15385389 ] Dongjoon Hyun commented on SPARK-16464: --- Here is the situation regeneration. **1.6.x Branch**

[jira] [Commented] (SPARK-16641) Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead)

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385376#comment-15385376 ] Reynold Xin commented on SPARK-16641: - cc [~cloud_fan] / [~marmbrus] > Add an Option to Create a

[jira] [Commented] (SPARK-16641) Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead)

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385375#comment-15385375 ] Reynold Xin commented on SPARK-16641: - Maybe {code} spark.read.option("delimiter",

[jira] [Created] (SPARK-16641) Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead)

2016-07-19 Thread Pat McDonough (JIRA)
Pat McDonough created SPARK-16641: - Summary: Add an Option to Create a Dataset With a Case Class, Ignoring Column Names (Using ordinal instead) Key: SPARK-16641 URL:

[jira] [Commented] (SPARK-15694) Implement ScriptTransformation in sql/core

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385353#comment-15385353 ] Reynold Xin commented on SPARK-15694: - It makes it impossible to run script transform without the

[jira] [Commented] (SPARK-16613) RDD.pipe returns values for empty partitions

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385350#comment-15385350 ] Reynold Xin commented on SPARK-16613: - But the problem of result changing depending on partitioning

[jira] [Assigned] (SPARK-16640) Add codegen for Elt function

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16640: Assignee: Apache Spark > Add codegen for Elt function > > >

[jira] [Commented] (SPARK-16640) Add codegen for Elt function

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385348#comment-15385348 ] Apache Spark commented on SPARK-16640: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16640) Add codegen for Elt function

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16640: Assignee: (was: Apache Spark) > Add codegen for Elt function >

[jira] [Created] (SPARK-16640) Add codegen for Elt function

2016-07-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-16640: --- Summary: Add codegen for Elt function Key: SPARK-16640 URL: https://issues.apache.org/jira/browse/SPARK-16640 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-16632. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14272

[jira] [Commented] (SPARK-2183) Avoid loading/shuffling data twice in self-join query

2016-07-19 Thread Emma Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385332#comment-15385332 ] Emma Tang commented on SPARK-2183: -- Hitting the same problem, the data is being loaded twice. Caching the

[jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385315#comment-15385315 ] Dongjoon Hyun commented on SPARK-16464: --- Thank you, [~proflin]. May I work on this? I'm now

[jira] [Commented] (SPARK-16639) query fails if having condition contains grouping column

2016-07-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385312#comment-15385312 ] Wenchen Fan commented on SPARK-16639: - 1.6 also fails > query fails if having condition contains

[jira] [Closed] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu closed SPARK-16638. -- Resolution: Not A Problem > The L2 regularization of LinearRegression seems wrong when standardization

[jira] [Commented] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385306#comment-15385306 ] Weichen Xu commented on SPARK-16638: seems i'm wrong, the intention of author may be to use w[i] /

[jira] [Created] (SPARK-16639) query fails if having condition contains grouping column

2016-07-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-16639: --- Summary: query fails if having condition contains grouping column Key: SPARK-16639 URL: https://issues.apache.org/jira/browse/SPARK-16639 Project: Spark Issue

[jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385263#comment-15385263 ] Shivaram Venkataraman commented on SPARK-16464: --- Thanks [~proflin] for checking this. In

[jira] [Comment Edited] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385255#comment-15385255 ] Liwei Lin edited comment on SPARK-16464 at 7/20/16 3:30 AM: Hi [~shivaram],

[jira] [Commented] (SPARK-16464) withColumn() allows illegal creation of duplicate column names on DataFrame

2016-07-19 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385255#comment-15385255 ] Liwei Lin commented on SPARK-16464: --- In scala, {{withColumn}}'s behavior is "adding a column or

[jira] [Commented] (SPARK-16613) RDD.pipe returns values for empty partitions

2016-07-19 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385235#comment-15385235 ] Tejas Patil commented on SPARK-16613: - [~srowen] , [~rxin] : I feel that invoking the pipe command

[jira] [Updated] (SPARK-16633) lag/lead does not return the default value when the offset row does not exist

2016-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-16633: - Target Version/s: 2.0.0 > lag/lead does not return the default value when the offset row does not exist

[jira] [Updated] (SPARK-16633) lag/lead does not return the default value when the offset row does not exist

2016-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-16633: - Description: Please see the attached notebook. Seems lag/lead somehow fail to recognize that a offset

[jira] [Updated] (SPARK-16296) add null check for key when create map data in encoder

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16296: Target Version/s: 2.0.1 (was: 2.0.0, 2.0.1) > add null check for key when create map data in

[jira] [Updated] (SPARK-16629) UDTs can not be compared to DataTypes in dataframes.

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16629: Target Version/s: 2.0.1 (was: 2.0.0) > UDTs can not be compared to DataTypes in dataframes. >

[jira] [Commented] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385209#comment-15385209 ] Xiao Li commented on SPARK-16601: - When you quote it by backticks, it will be treated as the tableName.

[jira] [Commented] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-07-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385210#comment-15385210 ] Xiao Li commented on SPARK-16601: - Please remove the backticks and retry it. Thanks~ > Spark2.0 fail in
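
To illustrate the point in the two comments above about backtick quoting, here is a minimal Python sketch. It is not Spark's parser; `split_identifier` is a hypothetical helper written only for this illustration. It shows why wrapping the whole name in backticks yields a single name containing a dot rather than a database/table pair.

```python
def split_identifier(name):
    """Split a SQL identifier on dots, keeping backtick-quoted runs as one part.

    Hypothetical helper for illustration only; not taken from Spark.
    """
    parts, current, quoted = [], [], False
    for ch in name:
        if ch == "`":
            # A backtick toggles quoting; dots inside quotes are literal.
            quoted = not quoted
        elif ch == "." and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_identifier("`db.tableName`"))    # ['db.tableName']
print(split_identifier("db.tableName"))      # ['db', 'tableName']
print(split_identifier("`db`.`tableName`"))  # ['db', 'tableName']
```

Under this reading, only the unquoted form (or quoting each part separately) names table `tableName` in database `db`.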

[jira] [Assigned] (SPARK-10683) Source code missing for SparkR test JAR

2016-07-19 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman reassigned SPARK-10683: - Assignee: Shivaram Venkataraman > Source code missing for SparkR test

[jira] [Updated] (SPARK-16582) Explicitly define isNull = false for non-nullable expressions

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16582: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16557) Remove stale doc in sql/README.md

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16557: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16528) HiveClientImpl throws NPE when reading database from a custom metastore

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16528: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16112) R programming guide update for gapply and gapplyCollect

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16112: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 > R

[jira] [Updated] (SPARK-16615) Expose sqlContext in SparkSession

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16615: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16590) Improve LogicalPlanToSQLSuite to check generated SQL directly

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16590: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16507) Add CRAN checks to SparkR

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16507: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 > Add

[jira] [Updated] (SPARK-16553) Typo in Spark SQL Programming guide that links to examples

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16553: Fix Version/s: (was: 2.0.1) 2.0.0 > Typo in Spark SQL Programming guide

[jira] [Updated] (SPARK-16584) Move regexp unit tests to RegexpExpressionsSuite

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16584: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16230) Executors self-killing after being assigned tasks while still in init

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16230: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16055: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16540) Jars specified with --jars will added twice when running on YARN

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16540: Fix Version/s: (was: 2.0.1) 2.0.0 > Jars specified with --jars will added

[jira] [Updated] (SPARK-16588) Deprecate monotonicallyIncreasingId in Scala

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16588: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16600) fix latex formula syntax error in mllib

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16600: Fix Version/s: (was: 2.0.1) 2.0.0 > fix latex formula syntax error in mllib

[jira] [Updated] (SPARK-16510) Move SparkR test JAR into Spark, include its source code

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16510: Fix Version/s: (was: 2.0.1) (was: 2.1.0) 2.0.0 >

[jira] [Updated] (SPARK-16555) Work around Jekyll error-handling bug which led to silent doc build failures

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16555: Fix Version/s: (was: 2.0.1) 2.0.0 > Work around Jekyll error-handling bug

[jira] [Resolved] (SPARK-16568) update sql programing guide refreshTable API

2016-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16568. - Resolution: Fixed Assignee: Weichen Xu Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-10683) Source code missing for SparkR test JAR

2016-07-19 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-10683. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue

[jira] [Resolved] (SPARK-16510) Move SparkR test JAR into Spark, include its source code

2016-07-19 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-16510. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue

[jira] [Updated] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-16638: --- Description: The original L2 is 0.5 * effectiveL2regParam * sigma( wi^2 ) (wi is the coefficients we

[jira] [Assigned] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16638: Assignee: Apache Spark > The L2 regularization of LinearRegression seems wrong when

[jira] [Assigned] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16638: Assignee: (was: Apache Spark) > The L2 regularization of LinearRegression seems wrong

[jira] [Commented] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385200#comment-15385200 ] Apache Spark commented on SPARK-16638: -- User 'WeichenXu123' has created a pull request for this

[jira] [Commented] (SPARK-15694) Implement ScriptTransformation in sql/core

2016-07-19 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385197#comment-15385197 ] Tejas Patil commented on SPARK-15694: - [~rxin] : I haven't started working on this. I might get some

[jira] [Created] (SPARK-16638) The L2 regularization of LinearRegression seems wrong when standardization is false

2016-07-19 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-16638: -- Summary: The L2 regularization of LinearRegression seems wrong when standardization is false Key: SPARK-16638 URL: https://issues.apache.org/jira/browse/SPARK-16638

[jira] [Commented] (SPARK-16628) OrcConversions should not convert an ORC table represented by MetastoreRelation to HadoopFsRelation if metastore schema does not match schema stored in ORC files

2016-07-19 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385184#comment-15385184 ] Tejas Patil commented on SPARK-16628: - Thanks for notifying [~yhuai]. Is this specific to ORC only ?

[jira] [Assigned] (SPARK-16637) Support Mesos Unified Containerizer

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16637: Assignee: Apache Spark > Support Mesos Unified Containerizer >

[jira] [Assigned] (SPARK-16637) Support Mesos Unified Containerizer

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16637: Assignee: (was: Apache Spark) > Support Mesos Unified Containerizer >

[jira] [Commented] (SPARK-16637) Support Mesos Unified Containerizer

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385153#comment-15385153 ] Apache Spark commented on SPARK-16637: -- User 'mgummelt' has created a pull request for this issue:

[jira] [Resolved] (SPARK-12437) Reserved words (like table) throws error when writing a data frame to JDBC

2016-07-19 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong resolved SPARK-12437. Resolution: Duplicate > Reserved words (like table) throws error when writing a data frame to JDBC

[jira] [Commented] (SPARK-12437) Reserved words (like table) throws error when writing a data frame to JDBC

2016-07-19 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385148#comment-15385148 ] Sean Zhong commented on SPARK-12437: This issue is fixed by SPARK-16387. The column names are quoted

[jira] [Created] (SPARK-16637) Support Mesos Unified Containerizer

2016-07-19 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-16637: --- Summary: Support Mesos Unified Containerizer Key: SPARK-16637 URL: https://issues.apache.org/jira/browse/SPARK-16637 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-16636) Missing documentation for CalendarIntervalType type in sql-programming-guide.md

2016-07-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385127#comment-15385127 ] Hyukjin Kwon commented on SPARK-16636: -- I will cc you, [~rxin] [~cloud_fan] just in case. > Missing

[jira] [Created] (SPARK-16636) Missing documentation for CalendarIntervalType type in sql-programming-guide.md

2016-07-19 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16636: Summary: Missing documentation for CalendarIntervalType type in sql-programming-guide.md Key: SPARK-16636 URL: https://issues.apache.org/jira/browse/SPARK-16636

[jira] [Created] (SPARK-16635) Provide Session support in the Spark UI

2016-07-19 Thread Tao Lin (JIRA)
Tao Lin created SPARK-16635: --- Summary: Provide Session support in the Spark UI Key: SPARK-16635 URL: https://issues.apache.org/jira/browse/SPARK-16635 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-16633) lag/lead does not return the default value when the offset row does not exist

2016-07-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-16633: --- Attachment: window_function_bug.html JIRA went down right before [~yhuai] tried to upload the

[jira] [Resolved] (SPARK-14702) Expose SparkLauncher's ProcessBuilder for user flexibility

2016-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-14702. Resolution: Fixed Assignee: Andrew Duffy Fix Version/s: 2.1.0 > Expose

[jira] [Commented] (SPARK-9140) Replace TimeTracker by Stopwatch

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385082#comment-15385082 ] Apache Spark commented on SPARK-9140: - User 'MechCoder' has created a pull request for this issue:

[jira] [Commented] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385068#comment-15385068 ] Apache Spark commented on SPARK-16632: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16632: Assignee: (was: Apache Spark) > Vectorized parquet reader fails to read certain

[jira] [Assigned] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16632: Assignee: Apache Spark > Vectorized parquet reader fails to read certain fields from Hive

[jira] [Issue Comment Deleted] (SPARK-2183) Avoid loading/shuffling data twice in self-join query

2016-07-19 Thread Emma Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emma Tang updated SPARK-2183: - Comment: was deleted (was: I'm bumping into the same issue here with a self join. However, caching the

[jira] [Assigned] (SPARK-16634) GenericArrayData can't be loaded in certain JVMs

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16634: Assignee: (was: Apache Spark) > GenericArrayData can't be loaded in certain JVMs >

[jira] [Commented] (SPARK-16634) GenericArrayData can't be loaded in certain JVMs

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385056#comment-15385056 ] Apache Spark commented on SPARK-16634: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16634) GenericArrayData can't be loaded in certain JVMs

2016-07-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16634: Assignee: Apache Spark > GenericArrayData can't be loaded in certain JVMs >

[jira] [Created] (SPARK-16634) GenericArrayData can't be loaded in certain JVMs

2016-07-19 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-16634: -- Summary: GenericArrayData can't be loaded in certain JVMs Key: SPARK-16634 URL: https://issues.apache.org/jira/browse/SPARK-16634 Project: Spark Issue

[jira] [Commented] (SPARK-16533) Spark application not handling preemption messages

2016-07-19 Thread Lucas Winkelmann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385051#comment-15385051 ] Lucas Winkelmann commented on SPARK-16533: -- Glad I am not the only one. I would attach my error

[jira] [Commented] (SPARK-16533) Spark application not handling preemption messages

2016-07-19 Thread Emaad Manzoor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385048#comment-15385048 ] Emaad Manzoor commented on SPARK-16533: --- Hopefully someone with more experience with Spark chimes

[jira] [Commented] (SPARK-16533) Spark application not handling preemption messages

2016-07-19 Thread Lucas Winkelmann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385045#comment-15385045 ] Lucas Winkelmann commented on SPARK-16533: -- Increasing this number also did not change anything.

[jira] [Comment Edited] (SPARK-2183) Avoid loading/shuffling data twice in self-join query

2016-07-19 Thread Emma Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385019#comment-15385019 ] Emma Tang edited comment on SPARK-2183 at 7/19/16 11:04 PM: I'm bumping into

[jira] [Commented] (SPARK-16611) Expose several hidden DataFrame/RDD functions

2016-07-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385033#comment-15385033 ] Felix Cheung commented on SPARK-16611: -- there's also spark.lapply > Expose several hidden

[jira] [Updated] (SPARK-16589) Chained cartesian produces incorrect number of records

2016-07-19 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-16589: --- Description: Chaining cartesian calls in PySpark results in the number of records

[jira] [Updated] (SPARK-16589) Chained cartesian produces incorrect number of records

2016-07-19 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-16589: --- Affects Version/s: 1.4.0 1.5.0 > Chained cartesian produces

[jira] [Commented] (SPARK-2183) Avoid loading/shuffling data twice in self-join query

2016-07-19 Thread Emma Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385019#comment-15385019 ] Emma Tang commented on SPARK-2183: -- I'm bumping into the same issue here with a self join. However,

[jira] [Created] (SPARK-16633) lag/lead does not return the default value when the offset row does not exist

2016-07-19 Thread Yin Huai (JIRA)
Yin Huai created SPARK-16633: Summary: lag/lead does not return the default value when the offset row does not exist Key: SPARK-16633 URL: https://issues.apache.org/jira/browse/SPARK-16633 Project: Spark

[jira] [Commented] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385001#comment-15385001 ] Marcelo Vanzin commented on SPARK-16632: I mean that because it only considers the type specified

[jira] [Commented] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384996#comment-15384996 ] Yin Huai commented on SPARK-16632: -- [~vanzin] So, you mean that OnHeapColumnVector does not reserve

[jira] [Updated] (SPARK-16344) Array of struct with a single field name "element" can't be decoded from Parquet files written by Spark 1.6+

2016-07-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-16344: - Target Version/s: 2.1.0 (was: 2.0.0) > Array of struct with a single field name "element" can't be

[jira] [Created] (SPARK-16632) Vectorized parquet reader fails to read certain fields from Hive tables

2016-07-19 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-16632: -- Summary: Vectorized parquet reader fails to read certain fields from Hive tables Key: SPARK-16632 URL: https://issues.apache.org/jira/browse/SPARK-16632 Project:

[jira] [Commented] (SPARK-10627) Regularization for artificial neural networks

2016-07-19 Thread Ruben Janssen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384830#comment-15384830 ] Ruben Janssen commented on SPARK-10627: --- Would it be possible to separate these out into multiple

[jira] [Created] (SPARK-16631) Stopping sparkcontext does not shutdown fileserver

2016-07-19 Thread Howard (JIRA)
Howard created SPARK-16631: -- Summary: Stopping sparkcontext does not shutdown fileserver Key: SPARK-16631 URL: https://issues.apache.org/jira/browse/SPARK-16631 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-16533) Spark application not handling preemption messages

2016-07-19 Thread Lucas Winkelmann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384820#comment-15384820 ] Lucas Winkelmann commented on SPARK-16533: -- Looking into the container ID's I did find some

[jira] [Resolved] (SPARK-14808) Spark MLlib, GraphX, SparkR 2.0 QA umbrella

2016-07-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14808. --- Resolution: Done Fix Version/s: 2.0.0 > Spark MLlib, GraphX, SparkR 2.0 QA

[jira] [Resolved] (SPARK-14817) ML, Graph, R 2.0 QA: Programming guide update and migration guide

2016-07-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14817. --- Resolution: Fixed Fix Version/s: 2.0.0 Closing... > ML, Graph, R 2.0 QA:

[jira] [Commented] (SPARK-14808) Spark MLlib, GraphX, SparkR 2.0 QA umbrella

2016-07-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384795#comment-15384795 ] Joseph K. Bradley commented on SPARK-14808: --- I agree with those semantics; I'm just getting
