[jira] [Created] (SPARK-11578) User facing api for typed aggregation

2015-11-08 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-11578: Summary: User facing api for typed aggregation Key: SPARK-11578 URL: https://issues.apache.org/jira/browse/SPARK-11578 Project: Spark Issue Type

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Assignee: Wenchen Fan > append data to partitioned table will messes up the res

[jira] [Updated] (SPARK-11453) append data to partitioned table will messes up the result

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11453: - Target Version/s: 1.6.0 > append data to partitioned table will messes up the res

[jira] [Resolved] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-9241. - Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9406 [https

[jira] [Updated] (SPARK-9241) Supporting multiple DISTINCT columns

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9241: Assignee: Herman van Hovell > Supporting multiple DISTINCT colu

[jira] [Updated] (SPARK-11500) Not deterministic order of columns when using merging schemas.

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11500: - Target Version/s: 1.6.0 > Not deterministic order of columns when using merging sche

[jira] [Updated] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11546: - Assignee: Navis > Thrift server makes too many logs about result sch

[jira] [Resolved] (SPARK-11546) Thrift server makes too many logs about result schema

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11546. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9514

Re: [BUILD SYSTEM] quick jenkins downtime, november 5th 7am

2015-11-06 Thread Michael Armbrust
I'm noticing several problems with Jenkins since the upgrade. PR comments say: "Build started sha1 is merged." instead of actually printing the hash. Also: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45246/console GitHub pull request #9527 of commit

[jira] [Updated] (SPARK-9301) collect_set and collect_list aggregate functions

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9301: Priority: Critical (was: Major) > collect_set and collect_list aggregate functi

Re: [VOTE] Release Apache Spark 1.5.2 (RC2)

2015-11-06 Thread Michael Armbrust
+1 On Fri, Nov 6, 2015 at 9:27 AM, Chester Chen wrote: > +1 > Test against CDH5.4.2 with hadoop 2.6.0 version using yesterday's code, > build locally. > > Regression running in Yarn Cluster mode against few internal ML ( logistic > regression, linear regression, random

[jira] [Resolved] (SPARK-11450) Add support for UnsafeRow to Expand

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11450. -- Resolution: Fixed Fix Version/s: 1.6.0 > Add support for UnsafeRow to Exp

[jira] [Updated] (SPARK-11450) Add support for UnsafeRow to Expand

2015-11-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11450: - Assignee: Herman van Hovell > Add support for UnsafeRow to Exp

Re: Fwd: Re: DataFrame equality does not working in 1.5.1

2015-11-06 Thread Michael Armbrust
In particular this is sounding like: https://issues.apache.org/jira/browse/SPARK-10859 On Fri, Nov 6, 2015 at 1:05 PM, Michael Armbrust <mich...@databricks.com> wrote: > It would be great if you could try sql("SET > spark.sql.inMemoryColumnarStorage.partitionPruning=false"

Re: Unable to register UDF with StructType

2015-11-06 Thread Michael Armbrust
e pointers for creating Dynamic > Case Classes. > > TIA. > > > On Fri, Nov 6, 2015 at 12:20 PM, Michael Armbrust <mich...@databricks.com> > wrote: > >> You are returning the type StructType not an instance of a struct (i.e. >> StringType instead of "

Re: Fwd: Re: DataFrame equality does not working in 1.5.1

2015-11-06 Thread Michael Armbrust
It would be great if you could try sql("SET spark.sql.inMemoryColumnarStorage.partitionPruning=false") also, try Spark 1.5.2-RC2 On Fri, Nov 6, 2015 at 4:49 AM, Seongduk Cheon wrote: > Hi Yanal! > > Yes,

[jira] [Commented] (SPARK-11470) Figure out a good name for the public API

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992233#comment-14992233 ] Michael Armbrust commented on SPARK-11470: -- The more I think about this, the more that I think

[jira] [Created] (SPARK-11528) Typed-safe aggregations

2015-11-05 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-11528: Summary: Typed-safe aggregations Key: SPARK-11528 URL: https://issues.apache.org/jira/browse/SPARK-11528 Project: Spark Issue Type: Sub-task

Re: Spark 1.6 Release Schedule

2015-11-05 Thread Michael Armbrust
ore an RC), non-Blocker non-bugs > untargeted, or in a few cases pushed to 1.6.1 or beyond > > 4. After next week, non-Blocker and non-Critical bugs are pushed, as the > RC is then late. > > 5. No release candidate until no Blockers are open. > > 6. (Repeat 1 and 2 more reg

[jira] [Updated] (SPARK-7148) Configure Parquet block size (row group size) for ML model import/export

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7148: Target Version/s: (was: 1.6.0) > Configure Parquet block size (row group size) for

[jira] [Updated] (SPARK-10954) Parquet version in the "created_by" metadata field of Parquet files written by Spark 1.5 and 1.6 is wrong

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10954: - Target Version/s: (was: 1.6.0) > Parquet version in the "created_by" m

[jira] [Commented] (SPARK-7148) Configure Parquet block size (row group size) for ML model import/export

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992693#comment-14992693 ] Michael Armbrust commented on SPARK-7148: - Is this still a problem? Newer versions of parquet

[jira] [Resolved] (SPARK-11533) [SPARK-11447] Null comparison requires type information but type extraction fails for complex types

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11533. -- Resolution: Invalid > [SPARK-11447] Null comparison requires type information but t

[jira] [Updated] (SPARK-11540) QueryExecutionListener

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11540: - Priority: Blocker (was: Major) > QueryExecutionListe

[jira] [Updated] (SPARK-6413) For data source tables, we should provide better output for DESCRIBE FORMATTED

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-6413: Target Version/s: (was: 1.6.0) > For data source tables, we should provide better out

[jira] [Updated] (SPARK-10519) Investigate if we should encode timezone information to a timestamp value stored in JSON

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10519: - Target Version/s: (was: 1.6.0) > Investigate if we should encode timezone informat

[jira] [Updated] (SPARK-10180) JDBCRDD does not process EqualNullSafe filter.

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10180: - Target Version/s: (was: 1.6.0) > JDBCRDD does not process EqualNullSafe fil

[jira] [Updated] (SPARK-11450) Add support for UnsafeRow to Expand

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11450: - Target Version/s: (was: 1.6.0) > Add support for UnsafeRow to Exp

[jira] [Resolved] (SPARK-9673) Use unbiased standard deviation in DataFrame.describe

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-9673. - Resolution: Fixed Looks like this was fixed by: https://github.com/apache/spark/pull/6297

[jira] [Updated] (SPARK-11451) Support single distinct count on multiple columns

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11451: - Target Version/s: (was: 1.6.0) > Support single distinct count on multiple colu

[jira] [Updated] (SPARK-8115) Remove TestData

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8115: Target Version/s: (was: 1.6.0) > Remove TestD

[jira] [Updated] (SPARK-11011) UserDefinedType serialization should be strongly typed

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11011: - Target Version/s: (was: 1.6.0) > UserDefinedType serialization should be stron

[jira] [Commented] (SPARK-10519) Investigate if we should encode timezone information to a timestamp value stored in JSON

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992678#comment-14992678 ] Michael Armbrust commented on SPARK-10519: -- Is this a moot point now that timestamp is tungsten

Re: Spark SQL supports operating on a thrift data sources

2015-11-05 Thread Michael Armbrust
This would make an awesome spark-package. I'd suggest looking at spark-avro as an example: https://github.com/databricks/spark-avro On Thu, Nov 5, 2015 at 11:21 AM, Jaydeep Vishwakarma < jaydeep.vishwaka...@inmobi.com> wrote: > Hi, > > I want to load thrift serialised data through sqlcontext and

[jira] [Resolved] (SPARK-11528) Typed-safe aggregations

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11528. -- Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9499

Re: Unable to register UDF with StructType

2015-11-05 Thread Michael Armbrust
You are returning the type StructType not an instance of a struct (i.e. StringType instead of "string"). If you'd like to return a struct you should return a case class. case class StringInfo(numChars: Int, firstLetter: String) udf((s: String) => StringInfo(s.size, s.head.toString)) If you'd like to
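
The advice in this reply, returning a case class instance rather than a type descriptor, can be illustrated with a minimal sketch in plain Scala. The object name StringInfoExample and the helper toStringInfo are hypothetical names chosen for illustration; only StringInfo and its two fields come from the message. The sketch uses s.take(1) so that the second field is a String and an empty input does not throw. No Spark dependency is needed to run it; in a Spark application the same function would presumably be wrapped with org.apache.spark.sql.functions.udf, as the comment notes.

```scala
// Case class taken from the message: each instance becomes one struct value.
case class StringInfo(numChars: Int, firstLetter: String)

// Hypothetical holder object, introduced here only for illustration.
object StringInfoExample {
  // Returns an *instance* of the struct, not a type such as StructType.
  // take(1) yields a String ("" for empty input) rather than a Char.
  def toStringInfo(s: String): StringInfo = StringInfo(s.length, s.take(1))

  def main(args: Array[String]): Unit = {
    // With Spark on the classpath (an assumption, not shown here), the
    // function could be registered as: udf(toStringInfo _)
    println(toStringInfo("spark")) // prints StringInfo(5,s)
  }
}
```

Keeping the logic in an ordinary function makes it easy to check outside Spark before it is registered as a UDF.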

[jira] [Resolved] (SPARK-11188) Elide stacktraces in bin/spark-sql for AnalysisExceptions

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11188. -- Resolution: Fixed Fix Version/s: (was: 1.6.0) 1.4.2

Re: Guava ClassLoading Issue When Using Different Hive Metastore Version

2015-11-05 Thread Michael Armbrust
I would be in favor of limiting the scope here. The problem you might run into is that FinalizableReferenceQueue uses the

[jira] [Closed] (SPARK-11470) Figure out a good name for the public API

2015-11-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-11470. Resolution: Won't Fix We reverted the API > Figure out a good name for the public

[jira] [Updated] (SPARK-2973) Use LocalRelation for all ExecutedCommands, avoid job for take/collect()

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2973: Target Version/s: (was: 1.6.0) > Use LocalRelation for all ExecutedCommands, avoid

[jira] [Updated] (SPARK-4131) Support "Writing data into the filesystem from queries"

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4131: Target Version/s: (was: 1.6.0) > Support "Writing data into the filesystem from

[jira] [Updated] (SPARK-9988) Create local (external) sort operator

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9988: Target Version/s: (was: 1.6.0) > Create local (external) sort opera

[jira] [Updated] (SPARK-9989) Create local sort-merge join operator

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9989: Target Version/s: (was: 1.6.0) > Create local sort-merge join opera

[jira] [Updated] (SPARK-9987) Create local aggregate operator

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9987: Target Version/s: (was: 1.6.0) > Create local aggregate opera

[jira] [Updated] (SPARK-9689) Cache doesn't refresh for HadoopFsRelation based table

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9689: Target Version/s: (was: 1.6.0) > Cache doesn't refresh for HadoopFsRelation based ta

[jira] [Updated] (SPARK-9879) OOM in LIMIT clause with large number

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9879: Target Version/s: (was: 1.6.0) > OOM in LIMIT clause with large num

[jira] [Updated] (SPARK-7549) Support aggregating over nested fields

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7549: Target Version/s: (was: 1.6.0) > Support aggregating over nested fie

[jira] [Commented] (SPARK-9697) Project Tungsten (Spark 1.6)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987094#comment-14987094 ] Michael Armbrust commented on SPARK-9697: - [~rxin] can you update this now that we are past code

[jira] [Updated] (SPARK-9860) Join: Determine the join strategy (broadcast join or shuffle join) at runtime

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9860: Target Version/s: (was: 1.6.0) > Join: Determine the join strategy (broadcast j

Re: How to handle Option[Int] in dataframe

2015-11-03 Thread Michael Armbrust
In Spark 1.6 there is an experimental new feature called Datasets. You can call df.as[Student] and it should do what you want. Would love any feedback you have if you get a chance to try it out (we'll hopefully publish a preview release next week). On Mon, Nov 2, 2015 at 9:30 PM, manas kar

[jira] [Updated] (SPARK-8122) ParquetRelation.enableLogForwarding() may fail to configure loggers

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8122: Issue Type: Bug (was: Sub-task) Parent: (was: SPARK-5463

[jira] [Resolved] (SPARK-5463) Improve Parquet support (reliability, performance, and error messages)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5463. - Resolution: Fixed > Improve Parquet support (reliability, performance, and error messa

[jira] [Updated] (SPARK-5463) Improve Parquet support (reliability, performance, and error messages)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5463: Assignee: Cheng Lian > Improve Parquet support (reliability, performance, and er

[jira] [Updated] (SPARK-9995) Create local Python evaluation operator

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9995: Target Version/s: (was: 1.6.0) > Create local Python evaluation opera

[jira] [Updated] (SPARK-8964) Use Exchange in limit operations (per partition limit -> exchange to one partition -> per partition limit)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8964: Target Version/s: (was: 1.6.0) > Use Exchange in limit operations (per partition li

[jira] [Updated] (SPARK-8745) Remove GenerateProjection

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8745: Target Version/s: (was: 1.6.0) > Remove GenerateProject

[jira] [Updated] (SPARK-11196) Support for equality and pushdown of filters on some UDTs

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11196: - Target Version/s: (was: 1.6.0) > Support for equality and pushdown of filters on s

[jira] [Updated] (SPARK-10818) Query optimization: investigate whether we need a separate optimizer from Spark SQL's

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10818: - Target Version/s: (was: 1.6.0) > Query optimization: investigate whether we n

[jira] [Updated] (SPARK-7970) Optimize code for SQL queries fired on Union of RDDs (closure cleaner)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7970: Target Version/s: (was: 1.6.0) > Optimize code for SQL queries fired on Union of R

[jira] [Updated] (SPARK-7903) PythonUDT shouldn't get serialized on the Scala side

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7903: Target Version/s: (was: 1.6.0) > PythonUDT shouldn't get serialized on the Scala s

[jira] [Updated] (SPARK-7245) Spearman correlation for DataFrames

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7245: Target Version/s: (was: 1.6.0) > Spearman correlation for DataFra

[jira] [Updated] (SPARK-10815) API design: data sources and sinks

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10815: - Target Version/s: (was: 1.6.0) > API design: data sources and si

[jira] [Updated] (SPARK-8144) For PySpark SQL, automatically convert values provided in readwriter options to string

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8144: Target Version/s: (was: 1.6.0) > For PySpark SQL, automatically convert values provi

[jira] [Updated] (SPARK-8108) Build Hive module by default (i.e. remove -Phive profile)

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8108: Target Version/s: 2+ (was: 1.6.0) > Build Hive module by default (i.e. remove -Ph

[jira] [Updated] (SPARK-9372) For a join operator, rows with null equal join key expression can be filtered out early

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9372: Target Version/s: (was: 1.6.0) > For a join operator, rows with null equal join

[jira] [Updated] (SPARK-10297) When save data to a data source table, we should bound the size of a saved file

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10297: - Target Version/s: (was: 1.6.0) > When save data to a data source table, we sho

[jira] [Updated] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9182: Target Version/s: (was: 1.6.0) > filter and groupBy on DataFrames are not passed thro

[jira] [Updated] (SPARK-10146) Have an easy way to set data source reader/writer specific confs

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10146: - Target Version/s: (was: 1.6.0) > Have an easy way to set data source reader/wri

[jira] [Updated] (SPARK-9139) Add backwards-compatibility tests for DataType.fromJson()

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9139: Target Version/s: (was: 1.6.0) > Add backwards-compatibility tests for DataType.fromJ

[jira] [Updated] (SPARK-10343) Consider nullability of expression in codegen

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10343: - Target Version/s: (was: 1.6.0) > Consider nullability of expression in code

[jira] [Updated] (SPARK-9431) TimeIntervalType for for time intervals

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9431: Target Version/s: (was: 1.6.0) > TimeIntervalType for for time interv

[jira] [Resolved] (SPARK-9410) Better Multi-User Session Semantics for SQL Context

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-9410. - Resolution: Fixed Assignee: Davies Liu Fix Version/s: 1.6.0 > Bet

[jira] [Commented] (SPARK-7492) Convert LocalDataFrame to LocalMatrix

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987098#comment-14987098 ] Michael Armbrust commented on SPARK-7492: - Are we still trying to get this in for 1.6? > Conv

[jira] [Updated] (SPARK-10823) API design: external state management

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10823: - Target Version/s: (was: 1.6.0) > API design: external state managem

[jira] [Updated] (SPARK-10819) Logical plan: determine logical operators needed

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10819: - Target Version/s: (was: 1.6.0) > Logical plan: determine logical operators nee

[jira] [Updated] (SPARK-10820) Physical plan: determine physical operators needed

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10820: - Target Version/s: (was: 1.6.0) > Physical plan: determine physical operators nee

[jira] [Updated] (SPARK-10814) API design: convergence of batch and streaming DataFrame

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10814: - Target Version/s: (was: 1.6.0) > API design: convergence of batch and stream

[jira] [Updated] (SPARK-10813) API design: high level class structuring regarding windowed and non-windowed streams

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10813: - Target Version/s: (was: 1.6.0) > API design: high level class structuring regard

[jira] [Updated] (SPARK-10803) Allow users to write and query Parquet user-defined key-value metadata directly

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10803: - Target Version/s: (was: 1.6.0) > Allow users to write and query Parquet user-defi

[jira] [Updated] (SPARK-10816) API design: window and session specification

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10816: - Target Version/s: (was: 1.6.0) > API design: window and session specificat

[jira] [Updated] (SPARK-9557) Refactor ParquetFilterSuite and remove old ParquetFilters code

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9557: Target Version/s: (was: 1.6.0) > Refactor ParquetFilterSuite and remove

[jira] [Updated] (SPARK-8682) Range Join for Spark SQL

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8682: Target Version/s: (was: 1.6.0) > Range Join for Spark

[jira] [Updated] (SPARK-8641) Native Spark Window Functions

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-8641: Target Version/s: (was: 1.6.0) > Native Spark Window Functi

[jira] [Updated] (SPARK-2870) Thorough schema inference directly on RDDs of Python dictionaries

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2870: Target Version/s: (was: 1.6.0) > Thorough schema inference directly on RDDs of Pyt

[jira] [Updated] (SPARK-9850) Adaptive execution in Spark

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9850: Target Version/s: (was: 1.6.0) > Adaptive execution in Sp

[jira] [Closed] (SPARK-10270) Add/Replace some Java friendly DataFrame API

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-10270. Resolution: Won't Fix > Add/Replace some Java friendly DataFrame

[jira] [Updated] (SPARK-9701) allow not automatically using HiveContext with spark-shell when hive support built in

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9701: Target Version/s: (was: 1.6.0) > allow not automatically using HiveContext with sp

[jira] [Updated] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9783: Target Version/s: (was: 1.6.0) > Use SqlNewHadoopRDD in JSONRelation to eliminate ex

[jira] [Updated] (SPARK-9876) Upgrade parquet-mr to 1.8.1

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9876: Target Version/s: (was: 1.6.0) > Upgrade parquet-mr to 1.

[jira] [Updated] (SPARK-10621) Audit function names in FunctionRegistry and corresponding method names shown in functions.scala and functions.py

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-10621: Priority: Critical (was: Major) > Audit function names in FunctionRegis

[jira] [Resolved] (SPARK-10429) MutableProjection should evaluate all expressions first and then update the mutable row

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-10429. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9422

[jira] [Updated] (SPARK-11412) Support merge schema for ORC

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11412: Summary: Support merge schema for ORC (was: mergeSchema option not working for orc

[jira] [Updated] (SPARK-11412) Support merge schema for ORC

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11412: Issue Type: New Feature (was: Bug) > Support merge schema for

[jira] [Updated] (SPARK-9357) Remove JoinedRow

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9357: Target Version/s: (was: 1.6.0) > Remove Joined

[jira] [Updated] (SPARK-9487) Use the same num. worker threads in Scala/Python unit tests

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9487: Target Version/s: (was: 1.6.0) > Use the same num. worker threads in Scala/Python u

[jira] [Resolved] (SPARK-11436) we should rebind right encoder when join 2 datasets

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-11436. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 9391

[jira] [Updated] (SPARK-11436) we should rebind right encoder when join 2 datasets

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-11436: Assignee: Wenchen Fan > we should rebind right encoder when join 2 datas

[jira] [Commented] (SPARK-10681) DateTimeUtils needs a method to parse string to SQL's timestamp value

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987067#comment-14987067 ] Michael Armbrust commented on SPARK-10681: Can we bump this from 1.6? > DateTimeUtils ne

[jira] [Updated] (SPARK-9604) Unsafe ArrayData and MapData is very very slow

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9604: Target Version/s: (was: 1.6.0) > Unsafe ArrayData and MapData is very very s

[jira] [Updated] (SPARK-5517) Add input types for Java UDFs

2015-11-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5517: Target Version/s: (was: 1.6.0) > Add input types for Java U
