[jira] [Created] (SPARK-10630) createDataFrame from a Java List

2015-09-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10630: - Summary: createDataFrame from a Java List Key: SPARK-10630 URL: https://issues.apache.org/jira/browse/SPARK-10630 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-10631) Add missing API doc in pyspark.mllib.linalg.Vector

2015-09-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10631: - Summary: Add missing API doc in pyspark.mllib.linalg.Vector Key: SPARK-10631 URL: https://issues.apache.org/jira/browse/SPARK-10631 Project: Spark Issue

[jira] [Resolved] (SPARK-10516) Add values as a property to DenseVector in PySpark

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-10516. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8682

[jira] [Commented] (SPARK-10630) createDataFrame from a Java List

2015-09-16 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746968#comment-14746968 ] holdenk commented on SPARK-10630: - Sounds good to me :) I'll give it a shot :) > createDataFrame from a

[jira] [Updated] (SPARK-5314) java.lang.OutOfMemoryError in SparkSQL with GROUP BY

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5314: - Assignee: Michael Armbrust > java.lang.OutOfMemoryError in SparkSQL with GROUP BY >

[jira] [Updated] (SPARK-5397) Assigning aliases to several return values of an UDF

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5397: - Assignee: Michael Armbrust > Assigning aliases to several return values of an UDF >

[jira] [Updated] (SPARK-5302) Add support for SQLContext "partition" columns

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5302: - Assignee: Cheng Lian > Add support for SQLContext "partition" columns >

[jira] [Updated] (SPARK-5421) SparkSql throw OOM at shuffle

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5421: - Assignee: Michael Armbrust > SparkSql throw OOM at shuffle > - > >

[jira] [Updated] (SPARK-8786) Create a wrapper for BinaryType

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-8786: - Assignee: Josh Rosen > Create a wrapper for BinaryType > --- > >

[jira] [Updated] (SPARK-6632) Optimize the parquetSchema to metastore schema reconciliation, so that the process is delegated to each map task itself

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6632: - Assignee: Michael Armbrust > Optimize the parquetSchema to metastore schema reconciliation, so that the

[jira] [Commented] (SPARK-10631) Add missing API doc in pyspark.mllib.linalg.Vector

2015-09-16 Thread Vinod KC (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746994#comment-14746994 ] Vinod KC commented on SPARK-10631: -- I'm working on this > Add missing API doc in

[jira] [Commented] (SPARK-10577) [PySpark] DataFrame hint for broadcast join

2015-09-16 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747058#comment-14747058 ] Maciej Bryński commented on SPARK-10577: Tested this patch today on a 1.5.0. Works great. Thank

[jira] [Updated] (SPARK-10508) incorrect evaluation of searched case expression

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10508: -- Assignee: Josh Rosen > incorrect evaluation of searched case expression >

[jira] [Updated] (SPARK-10632) Cannot save DataFrame with User Defined Types

2015-09-16 Thread Joao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joao updated SPARK-10632: - Description: Cannot save DataFrames that contain user-defined types. At first I thought it was a problem with my

[jira] [Updated] (SPARK-10632) Cannot save DataFrame with User Defined Types

2015-09-16 Thread Joao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joao updated SPARK-10632: - Description: Cannot save DataFrames that contain user-defined types. I tried to save a dataframe with instances

[jira] [Updated] (SPARK-3231) select on a table in parquet format containing smallint as a field type does not work

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3231: - Assignee: Alex Rovner > select on a table in parquet format containing smallint as a field type does >

[jira] [Updated] (SPARK-3617) Configurable case sensitivity

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3617: - Assignee: Michael Armbrust > Configurable case sensitivity > - > >

[jira] [Updated] (SPARK-2824) Allow saving Parquet files to the HiveMetastore

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2824: - Assignee: Michael Armbrust > Allow saving Parquet files to the HiveMetastore >

[jira] [Updated] (SPARK-2695) Figure out a good way to handle NullType columns.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2695: - Assignee: Michael Armbrust > Figure out a good way to handle NullType columns. >

[jira] [Created] (SPARK-10632) Cannot save DataFrame with User Defined Types

2015-09-16 Thread Joao (JIRA)
Joao created SPARK-10632: Summary: Cannot save DataFrame with User Defined Types Key: SPARK-10632 URL: https://issues.apache.org/jira/browse/SPARK-10632 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-10629) Gradient boosted trees: mapPartitions input size increasing

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747088#comment-14747088 ] Sean Owen commented on SPARK-10629: --- That sounds like the same issue in SPARK-10433; it's not clear

[jira] [Commented] (SPARK-10485) IF expression is not correctly resolved when one of the options have NullType

2015-09-16 Thread Antonio Jesus Navarro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747089#comment-14747089 ] Antonio Jesus Navarro commented on SPARK-10485: --- Now, the bug is into the checking of the

[jira] [Comment Edited] (SPARK-10485) IF expression is not correctly resolved when one of the options have NullType

2015-09-16 Thread Antonio Jesus Navarro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14747089#comment-14747089 ] Antonio Jesus Navarro edited comment on SPARK-10485 at 9/16/15 7:32 AM:

[jira] [Updated] (SPARK-3700) Improve the performance of scanning JSON datasets

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3700: - Assignee: Yanbo Liang > Improve the performance of scanning JSON datasets >

[jira] [Updated] (SPARK-3833) Allow Spark SQL SchemaRDDs to be merged

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3833: - Assignee: Michael Armbrust > Allow Spark SQL SchemaRDDs to be merged >

[jira] [Updated] (SPARK-3978) Schema change on Spark-Hive (Parquet file format) table not working

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3978: - Assignee: Alex Rovner > Schema change on Spark-Hive (Parquet file format) table not working >

[jira] [Updated] (SPARK-5738) Reuse mutable row for each record at jsonStringToRow

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5738: - Assignee: Yanbo Liang > Reuse mutable row for each record at jsonStringToRow >

[jira] [Updated] (SPARK-3804) Output of Generator expressions is not stable after serialization.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3804: - Assignee: Michael Armbrust > Output of Generator expressions is not stable after serialization. >

[jira] [Updated] (SPARK-5823) Reuse mutable rows for inner structures when parsing JSON objects

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5823: - Assignee: Yanbo Liang > Reuse mutable rows for inner structures when parsing JSON objects >

[jira] [Resolved] (SPARK-10511) Source releases should not include maven jars

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10511. --- Resolution: Fixed Fix Version/s: 1.5.1 1.6.0 Resolved by

[jira] [Updated] (SPARK-4559) Adding support for ucase and lcase

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4559: - Assignee: Michael Armbrust > Adding support for ucase and lcase > -- > >

[jira] [Updated] (SPARK-4273) Providing ExternalSet to avoid OOM when count(distinct)

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4273: - Assignee: Michael Armbrust > Providing ExternalSet to avoid OOM when count(distinct) >

[jira] [Updated] (SPARK-5109) Loading multiple parquet files into a single SchemaRDD

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5109: - Assignee: Michael Armbrust > Loading multiple parquet files into a single SchemaRDD >

[jira] [Updated] (SPARK-10515) When killing executor, the pending replacement executors will be lost

2015-09-16 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-10515: -- Summary: When killing executor, the pending replacement executors will be lost (was: When

[jira] [Updated] (SPARK-9032) scala.MatchError in DataFrameReader.json(String path)

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-9032: - Assignee: Josh Rosen > scala.MatchError in DataFrameReader.json(String path) >

[jira] [Updated] (SPARK-9343) DROP TABLE ignores IF EXISTS clause

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-9343: - Assignee: Michael Armbrust > DROP TABLE ignores IF EXISTS clause > --- >

[jira] [Updated] (SPARK-9033) scala.MatchError: interface java.util.Map (of class java.lang.Class) with Spark SQL

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-9033: - Assignee: Josh Rosen > scala.MatchError: interface java.util.Map (of class java.lang.Class) with > Spark

[jira] [Updated] (SPARK-10437) Support aggregation expressions in Order By

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10437: -- Assignee: Liang-Chi Hsieh > Support aggregation expressions in Order By >

[jira] [Updated] (SPARK-10475) improve column prunning for Project on Sort

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10475: -- Assignee: Wenchen Fan > improve column prunning for Project on Sort >

[jira] [Created] (SPARK-10633) Persisting Spark stream to MySQL - Spark tries to create the table for every stream even if it exist already.

2015-09-16 Thread Lunen (JIRA)
Lunen created SPARK-10633: - Summary: Persisting Spark stream to MySQL - Spark tries to create the table for every stream even if it exist already. Key: SPARK-10633 URL: https://issues.apache.org/jira/browse/SPARK-10633

[jira] [Updated] (SPARK-10633) Persisting Spark stream to MySQL - Spark tries to create the table for every stream even if it exist already.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10633: -- Priority: Major (was: Blocker) [~lunendl] Please have a look at

[jira] [Resolved] (SPARK-10267) Add @Since annotation to ml.util

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-10267. --- Resolution: Not A Problem Assignee: Ehsan Mohyedin Kermani > Add @Since annotation to

[jira] [Updated] (SPARK-10631) Add missing API doc in pyspark.mllib.linalg.Vector

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10631: -- Assignee: Vinod KC > Add missing API doc in pyspark.mllib.linalg.Vector >

[jira] [Resolved] (SPARK-10276) Add @since annotation to pyspark.mllib.recommendation

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-10276. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8677

[jira] [Commented] (SPARK-6810) Performance benchmarks for SparkR

2015-09-16 Thread Yashwanth Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768853#comment-14768853 ] Yashwanth Kumar commented on SPARK-6810: Hi Shivaram Venkataraman, I would like to try this. >

[jira] [Commented] (SPARK-10262) Add @Since annotation to ml.attribute

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768840#comment-14768840 ] Xiangrui Meng commented on SPARK-10262: --- [~tijoparacka] Are you still working on this? > Add

[jira] [Commented] (SPARK-9492) LogisticRegression in R should provide model statistics

2015-09-16 Thread Yashwanth Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768850#comment-14768850 ] Yashwanth Kumar commented on SPARK-9492: Can i Have this task? > LogisticRegression in R should

[jira] [Created] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Prachi Burathoki (JIRA)
Prachi Burathoki created SPARK-10634: Summary: The spark sql fails if the where clause contains a string with " in it. Key: SPARK-10634 URL: https://issues.apache.org/jira/browse/SPARK-10634

[jira] [Commented] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768938#comment-14768938 ] Sean Owen commented on SPARK-10634: --- Shouldn't the " be escaped in some way? or else I'm not sure how

[jira] [Commented] (SPARK-10614) SystemClock uses non-monotonic time in its wait logic

2015-09-16 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14768900#comment-14768900 ] Steve Loughran commented on SPARK-10614: Having done a little more detailed research on the

[jira] [Updated] (SPARK-10635) pyspark - running on a different host

2015-09-16 Thread Ben Duffield (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Duffield updated SPARK-10635: - Description: At various points we assume we only ever talk to a driver on the same host. e.g.

[jira] [Commented] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14769057#comment-14769057 ] Sean Owen commented on SPARK-10634: --- Don't you need to use "" to quote double quotes? I actually am not

[jira] [Commented] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Prachi Burathoki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14769011#comment-14769011 ] Prachi Burathoki commented on SPARK-10634: -- I tried by escaping with \, but still same error

[jira] [Updated] (SPARK-10635) pyspark - running on a different host

2015-09-16 Thread Ben Duffield (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Duffield updated SPARK-10635: - Description: At various points we assume we only ever talk to a driver on the same host. e.g.

[jira] [Updated] (SPARK-10635) pyspark - running on a different host

2015-09-16 Thread Ben Duffield (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Duffield updated SPARK-10635: - Description: At various points we assume we only ever talk to a driver on the same host. e.g.

[jira] [Created] (SPARK-10635) pyspark - running on a different host

2015-09-16 Thread Ben Duffield (JIRA)
Ben Duffield created SPARK-10635: Summary: pyspark - running on a different host Key: SPARK-10635 URL: https://issues.apache.org/jira/browse/SPARK-10635 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6810) Performance benchmarks for SparkR

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790700#comment-14790700 ] Xiangrui Meng commented on SPARK-6810: -- the ml part (glm) is about the same. all computation is on

[jira] [Commented] (SPARK-10602) Univariate statistics as UDAFs: single-pass continuous stats

2015-09-16 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790713#comment-14790713 ] Seth Hendrickson commented on SPARK-10602: -- Right now I have working versions of single pass

[jira] [Created] (SPARK-10637) DataFrames: saving with nested User Data Types

2015-09-16 Thread Joao (JIRA)
Joao created SPARK-10637: Summary: DataFrames: saving with nested User Data Types Key: SPARK-10637 URL: https://issues.apache.org/jira/browse/SPARK-10637 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-10636) RDD filter does not work after if..then..else RDD blocks

2015-09-16 Thread Glenn Strycker (JIRA)
Glenn Strycker created SPARK-10636: -- Summary: RDD filter does not work after if..then..else RDD blocks Key: SPARK-10636 URL: https://issues.apache.org/jira/browse/SPARK-10636 Project: Spark

[jira] [Commented] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Prachi Burathoki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790634#comment-14790634 ] Prachi Burathoki commented on SPARK-10634: -- I tried escaping " with both \ and ". But got the same

[jira] [Reopened] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-10634: --- > The spark sql fails if the where clause contains a string with " in it. >

[jira] [Assigned] (SPARK-9296) variance, var_pop, and var_samp aggregate functions

2015-09-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9296: --- Assignee: Apache Spark > variance, var_pop, and var_samp aggregate functions >

[jira] [Commented] (SPARK-9296) variance, var_pop, and var_samp aggregate functions

2015-09-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790663#comment-14790663 ] Apache Spark commented on SPARK-9296: - User 'JihongMA' has created a pull request for this issue:

[jira] [Updated] (SPARK-10637) DataFrames: saving with nested User Data Types

2015-09-16 Thread Joao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joao updated SPARK-10637: - Description: Cannot save data frames using nested UserDefinedType I wrote a simple example to show the error.

[jira] [Commented] (SPARK-6810) Performance benchmarks for SparkR

2015-09-16 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790654#comment-14790654 ] Shivaram Venkataraman commented on SPARK-6810: -- So the DataFrame API doesn't need much of

[jira] [Commented] (SPARK-10636) RDD filter does not work after if..then..else RDD blocks

2015-09-16 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790681#comment-14790681 ] Glenn Strycker commented on SPARK-10636: I didn't "forget", I believed that "RDD = if {} else {}

[jira] [Closed] (SPARK-10636) RDD filter does not work after if..then..else RDD blocks

2015-09-16 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Glenn Strycker closed SPARK-10636. -- > RDD filter does not work after if..then..else RDD blocks >

[jira] [Closed] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Prachi Burathoki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prachi Burathoki closed SPARK-10634. Resolution: Done > The spark sql fails if the where clause contains a string with " in it.

[jira] [Resolved] (SPARK-10636) RDD filter does not work after if..then..else RDD blocks

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10636. --- Resolution: Not A Problem In the first case, your {{.filter}} statement plainly applies only to the

[jira] [Resolved] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10634. --- Resolution: Not A Problem > The spark sql fails if the where clause contains a string with " in it.

[jira] [Closed] (SPARK-10634) The spark sql fails if the where clause contains a string with " in it.

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-10634. - > The spark sql fails if the where clause contains a string with " in it. >

[jira] [Assigned] (SPARK-9296) variance, var_pop, and var_samp aggregate functions

2015-09-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9296: --- Assignee: (was: Apache Spark) > variance, var_pop, and var_samp aggregate functions >

[jira] [Updated] (SPARK-10632) Cannot save DataFrame with User Defined Types

2015-09-16 Thread Joao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joao updated SPARK-10632: - Description: Cannot save DataFrames that contain user-defined types. I tried to save a dataframe with instances

[jira] [Created] (SPARK-10642) Crash in rdd.lookup() with "java.lang.Long cannot be cast to java.lang.Integer"

2015-09-16 Thread Thouis Jones (JIRA)
Thouis Jones created SPARK-10642: Summary: Crash in rdd.lookup() with "java.lang.Long cannot be cast to java.lang.Integer" Key: SPARK-10642 URL: https://issues.apache.org/jira/browse/SPARK-10642

[jira] [Commented] (SPARK-4440) Enhance the job progress API to expose more information

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790935#comment-14790935 ] Josh Rosen commented on SPARK-4440: --- Does anyone still want these extensions? If so, can you please come

[jira] [Resolved] (SPARK-4442) Move common unit test utilities into their own package / module

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4442. --- Resolution: Won't Fix We're using test-jar dependencies instead, so this is "Won't Fix". > Move

[jira] [Created] (SPARK-10645) Bivariate Statistics for continuous vs. continuous

2015-09-16 Thread Jihong MA (JIRA)
Jihong MA created SPARK-10645: - Summary: Bivariate Statistics for continuous vs. continuous Key: SPARK-10645 URL: https://issues.apache.org/jira/browse/SPARK-10645 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-10602) Univariate statistics as UDAFs: single-pass continuous stats

2015-09-16 Thread Jihong MA (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790816#comment-14790816 ] Jihong MA commented on SPARK-10602: --- I go ahead/ created SPARK-10641, since this JIRA is not listed as

[jira] [Updated] (SPARK-10485) IF expression is not correctly resolved when one of the options have NullType

2015-09-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10485: - Affects Version/s: 1.5.0 > IF expression is not correctly resolved when one of the

[jira] [Updated] (SPARK-10643) Support HDFS urls in spark-submit

2015-09-16 Thread Alan Braithwaite (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Braithwaite updated SPARK-10643: - Description: When using mesos with docker and marathon, it would be nice to be able to

[jira] [Updated] (SPARK-10641) skewness and kurtosis support

2015-09-16 Thread Jihong MA (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jihong MA updated SPARK-10641: -- Issue Type: Sub-task (was: New Feature) Parent: SPARK-10384 > skewness and kurtosis support >

[jira] [Resolved] (SPARK-10589) Add defense against external site framing

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-10589. --- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8745

[jira] [Resolved] (SPARK-2991) RDD transforms for scan and scanLeft

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2991. --- Resolution: Won't Fix > RDD transforms for scan and scanLeft > -

[jira] [Resolved] (SPARK-3497) Report serialized size of task binary

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3497. --- Resolution: Fixed We now have an automatic warning-level log message for large closures. > Report

[jira] [Commented] (SPARK-10636) RDD filter does not work after if..then..else RDD blocks

2015-09-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790792#comment-14790792 ] Sean Owen commented on SPARK-10636: --- It's a Scala syntax issue, as you say when you wondered if you're

[jira] [Commented] (SPARK-10602) Univariate statistics as UDAFs: single-pass continuous stats

2015-09-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790859#comment-14790859 ] Joseph K. Bradley commented on SPARK-10602: --- Yeah, JIRA only allows 2 levels of subtasks (a

[jira] [Resolved] (SPARK-869) Retrofit rest of RDD api to use proper serializer type

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-869. -- Resolution: Done Going to resolve this as Done; please open a new JIRA if you find specific examples

[jira] [Assigned] (SPARK-10640) Spark history server fails to parse taskEndReasonFromJson TaskCommitDenied

2015-09-16 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-10640: - Assignee: Thomas Graves > Spark history server fails to parse taskEndReasonFromJson

[jira] [Updated] (SPARK-10371) Optimize sequential projections

2015-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10371: -- Description: In ML pipelines, each transformer/estimator appends new columns to the input

[jira] [Resolved] (SPARK-3489) support rdd.zip(rdd1, rdd2,...) with variable number of rdds as params

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3489. --- Resolution: Won't Fix Resolving as "Won't Fix" per PR discussion. > support rdd.zip(rdd1, rdd2,...)

[jira] [Created] (SPARK-10644) Applications wait even if free executors are available

2015-09-16 Thread Balagopal Nair (JIRA)
Balagopal Nair created SPARK-10644: -- Summary: Applications wait even if free executors are available Key: SPARK-10644 URL: https://issues.apache.org/jira/browse/SPARK-10644 Project: Spark

[jira] [Resolved] (SPARK-4568) Publish release candidates under $VERSION-RCX instead of $VERSION

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4568. --- Resolution: Fixed We now do this. > Publish release candidates under $VERSION-RCX instead of

[jira] [Comment Edited] (SPARK-4568) Publish release candidates under $VERSION-RCX instead of $VERSION

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790945#comment-14790945 ] Josh Rosen edited comment on SPARK-4568 at 9/16/15 7:00 PM: We now do this.

[jira] [Commented] (SPARK-4216) Eliminate duplicate Jenkins GitHub posts from AMPLab

2015-09-16 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791016#comment-14791016 ] Nicholas Chammas commented on SPARK-4216: - Thanks Josh! > Eliminate duplicate Jenkins GitHub

[jira] [Created] (SPARK-10641) skewness and kurtosis support

2015-09-16 Thread Jihong MA (JIRA)
Jihong MA created SPARK-10641: - Summary: skewness and kurtosis support Key: SPARK-10641 URL: https://issues.apache.org/jira/browse/SPARK-10641 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-10320) Kafka Support new topic subscriptions without requiring restart of the streaming context

2015-09-16 Thread Sudarshan Kadambi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790804#comment-14790804 ] Sudarshan Kadambi commented on SPARK-10320: --- Sure, a function as proposed that allows for the

[jira] [Created] (SPARK-10643) Support HDFS urls in spark-submit

2015-09-16 Thread Alan Braithwaite (JIRA)
Alan Braithwaite created SPARK-10643: Summary: Support HDFS urls in spark-submit Key: SPARK-10643 URL: https://issues.apache.org/jira/browse/SPARK-10643 Project: Spark Issue Type: New

[jira] [Resolved] (SPARK-4738) Update the netty-3.x version in spark-assembly-*.jar

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4738. --- Resolution: Incomplete Resolving as "Incomplete" since this an old issue and it doesn't look like

[jira] [Closed] (SPARK-4087) Only use broadcast for large tasks

2015-09-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen closed SPARK-4087. - Resolution: Won't Fix > Only use broadcast for large tasks > -- > >
