[jira] [Created] (SPARK-4898) Replace cloudpickle with Dill

2014-12-19 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-4898: - Summary: Replace cloudpickle with Dill Key: SPARK-4898 URL: https://issues.apache.org/jira/browse/SPARK-4898 Project: Spark Issue Type: Bug Components: P

[jira] [Updated] (SPARK-4897) Python 3 support

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4897: -- Description: It would be nice to have Python 3 support in PySpark, provided that we can do it in a way

[jira] [Commented] (SPARK-4886) Support cache control for each partition of a Hive partitioned table

2014-12-19 Thread guowei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253138#comment-14253138 ] guowei commented on SPARK-4886: --- use "CACHE TABLE ... AS SELECT..." > Support cache contro

[jira] [Updated] (SPARK-3619) Upgrade to Mesos 0.21 to work around MESOS-1688

2014-12-19 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-3619: -- Assignee: Timothy Chen > Upgrade to Mesos 0.21 to work around MESOS-1688 > -

[jira] [Commented] (SPARK-3619) Upgrade to Mesos 0.21 to work around MESOS-1688

2014-12-19 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253150#comment-14253150 ] Andrew Ash commented on SPARK-3619: --- [~activars] Spark 1.2.0 is being released with a Me

[jira] [Updated] (SPARK-3619) Upgrade to Mesos 0.21 to work around MESOS-1688

2014-12-19 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-3619: -- Description: The Mesos 0.21 release has a fix for https://issues.apache.org/jira/browse/MESOS-1688, whic

[jira] [Commented] (SPARK-4886) Support cache control for each partition of a Hive partitioned table

2014-12-19 Thread Xudong Zheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253154#comment-14253154 ] Xudong Zheng commented on SPARK-4886: - Hi Guowei, "CACHE TABLE ... AS SELECT..." will

[jira] [Created] (SPARK-4899) Support Mesos features: roles and checkpoints

2014-12-19 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-4899: - Summary: Support Mesos features: roles and checkpoints Key: SPARK-4899 URL: https://issues.apache.org/jira/browse/SPARK-4899 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-4872) Provide sample format of training/test data in MLlib programming guide

2014-12-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253171#comment-14253171 ] Sean Owen commented on SPARK-4872: -- [~zhjunwei] This is not at all specific to Spark. No,

[jira] [Commented] (SPARK-4094) checkpoint should still be available after rdd actions

2014-12-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253191#comment-14253191 ] Sean Owen commented on SPARK-4094: -- [~liyezhang556520] But this is exactly what the doc s

[jira] [Commented] (SPARK-2075) Anonymous classes are missing from Spark distribution

2014-12-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253203#comment-14253203 ] Sean Owen commented on SPARK-2075: -- [~sunrui] From digging in to the various reports of t

[jira] [Created] (SPARK-4900) MLlib SingularValueDecomposition ARPACK IllegalStateException

2014-12-19 Thread Mike Beyer (JIRA)
Mike Beyer created SPARK-4900: - Summary: MLlib SingularValueDecomposition ARPACK IllegalStateException Key: SPARK-4900 URL: https://issues.apache.org/jira/browse/SPARK-4900 Project: Spark I

[jira] [Created] (SPARK-4901) Hot fix for the BytesWritable.copyBytes not exists in Hadoop1

2014-12-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4901: Summary: Hot fix for the BytesWritable.copyBytes not exists in Hadoop1 Key: SPARK-4901 URL: https://issues.apache.org/jira/browse/SPARK-4901 Project: Spark Issue Ty

[jira] [Commented] (SPARK-4901) Hot fix for the BytesWritable.copyBytes not exists in Hadoop1

2014-12-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253306#comment-14253306 ] Apache Spark commented on SPARK-4901: - User 'chenghao-intel' has created a pull reques

[jira] [Updated] (SPARK-4900) MLlib SingularValueDecomposition ARPACK IllegalStateException

2014-12-19 Thread Mike Beyer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Beyer updated SPARK-4900: -- Priority: Major (was: Blocker) > MLlib SingularValueDecomposition ARPACK IllegalStateException > -

[jira] [Updated] (SPARK-3373) Filtering operations should optionally rebuild routing tables

2014-12-19 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3373: Target Version/s: 1.3.0, 1.2.1 (was: 1.1.1, 1.2.0) Affects Version/s: (was: 1.0.2)

[jira] [Updated] (SPARK-3373) Filtering operations should optionally rebuild routing tables

2014-12-19 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3373: Priority: Major (was: Minor) > Filtering operations should optionally rebuild routing tables >

[jira] [Created] (SPARK-4902) gap-sampling performance optimization

2014-12-19 Thread Guoqiang Li (JIRA)
Guoqiang Li created SPARK-4902: -- Summary: gap-sampling performance optimization Key: SPARK-4902 URL: https://issues.apache.org/jira/browse/SPARK-4902 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-4844) SGD should support custom sampling.

2014-12-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li resolved SPARK-4844. Resolution: Won't Fix See: SPARK-4902 > SGD should support custom sampling. > -

[jira] [Commented] (SPARK-3619) Upgrade to Mesos 0.21 to work around MESOS-1688

2014-12-19 Thread Jing Dong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253394#comment-14253394 ] Jing Dong commented on SPARK-3619: -- Has anyone succeeded in running Spark 1.1.1 on Mesos 0.21?

[jira] [Updated] (SPARK-4902) gap-sampling performance optimization

2014-12-19 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-4902: --- Description: {{CacheManager.getOrCompute}} returns an instance of InterruptibleIterator that contains

[jira] [Created] (SPARK-4903) RDD remains cached after "DROP TABLE"

2014-12-19 Thread Evert Lammerts (JIRA)
Evert Lammerts created SPARK-4903: - Summary: RDD remains cached after "DROP TABLE" Key: SPARK-4903 URL: https://issues.apache.org/jira/browse/SPARK-4903 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-4903) RDD remains cached after "DROP TABLE"

2014-12-19 Thread Evert Lammerts (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evert Lammerts updated SPARK-4903: -- Description: In beeline, when I run: {code:sql} CREATE TABLE test AS select col from table; CACH

[jira] [Commented] (SPARK-4902) gap-sampling performance optimization

2014-12-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253409#comment-14253409 ] Apache Spark commented on SPARK-4902: - User 'witgo' has created a pull request for thi

[jira] [Created] (SPARK-4904) Remove the foldable checking in HiveGenericUdf.eval

2014-12-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4904: Summary: Remove the foldable checking in HiveGenericUdf.eval Key: SPARK-4904 URL: https://issues.apache.org/jira/browse/SPARK-4904 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-4904) Remove the foldable checking in HiveGenericUdf.eval

2014-12-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253418#comment-14253418 ] Apache Spark commented on SPARK-4904: - User 'chenghao-intel' has created a pull reques

[jira] [Commented] (SPARK-4867) UDF clean up

2014-12-19 Thread William Benton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253579#comment-14253579 ] William Benton commented on SPARK-4867: --- [~marmbrus] I actually think exposing an in

[jira] [Updated] (SPARK-4901) Hot fix for the BytesWritable.copyBytes not exists in Hadoop1

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4901: -- Assignee: Cheng Hao > Hot fix for the BytesWritable.copyBytes not exists in Hadoop1 > --

[jira] [Resolved] (SPARK-4901) Hot fix for the BytesWritable.copyBytes not exists in Hadoop1

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4901. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3742 [https://github.com/

[jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-12-19 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253586#comment-14253586 ] Ted Malaska commented on SPARK-2447: Hey guy, Just wanted to update this jira. In su

[jira] [Updated] (SPARK-3686) flume.SparkSinkSuite.Success is flaky

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3686: -- Labels: flaky-test (was: ) > flume.SparkSinkSuite.Success is flaky > --

[jira] [Updated] (SPARK-3912) FlumeStreamSuite is flaky, fails either with port binding issues or data not being reliably sent

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-3912: -- Labels: flaky-test (was: ) > FlumeStreamSuite is flaky, fails either with port binding issues or data n

[jira] [Updated] (SPARK-1603) flaky test case in StreamingContextSuite

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1603: -- Labels: flaky-test (was: ) > flaky test case in StreamingContextSuite > ---

[jira] [Updated] (SPARK-4053) Block generator throttling in NetworkReceiverSuite is flaky

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4053: -- Labels: flaky-test (was: ) > Block generator throttling in NetworkReceiverSuite is flaky >

[jira] [Updated] (SPARK-1158) Fix flaky RateLimitedOutputStreamSuite

2014-12-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1158: -- Labels: flaky-test (was: ) > Fix flaky RateLimitedOutputStreamSuite > -

[jira] [Created] (SPARK-4905) Flaky FlumeStreamSuite test: org.apache.spark.streaming.flume.FlumeStreamSuite.flume input stream

2014-12-19 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-4905: - Summary: Flaky FlumeStreamSuite test: org.apache.spark.streaming.flume.FlumeStreamSuite.flume input stream Key: SPARK-4905 URL: https://issues.apache.org/jira/browse/SPARK-4905

[jira] [Commented] (SPARK-4869) The variable names in IF statement of Spark SQL doesn't resolve to its value.

2014-12-19 Thread Arnab (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253659#comment-14253659 ] Arnab commented on SPARK-4869: -- Can you kindly clarify what DAYS_30 refers to. I tried out a

[jira] [Created] (SPARK-4906) Spark master OOMs with exception stack trace stored in JobProgressListener

2014-12-19 Thread Mingyu Kim (JIRA)
Mingyu Kim created SPARK-4906: - Summary: Spark master OOMs with exception stack trace stored in JobProgressListener Key: SPARK-4906 URL: https://issues.apache.org/jira/browse/SPARK-4906 Project: Spark

[jira] [Commented] (SPARK-4896) Don't redundantly copy executor dependencies in Utils.fetchFile

2014-12-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253779#comment-14253779 ] Apache Spark commented on SPARK-4896: - User 'ryan-williams' has created a pull request

[jira] [Updated] (SPARK-4903) RDD remains cached after "DROP TABLE"

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4903: Target Version/s: 1.3.0 > RDD remains cached after "DROP TABLE" > --

[jira] [Commented] (SPARK-4892) java.io.FileNotFound exceptions when creating EXTERNAL hive tables

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253809#comment-14253809 ] Michael Armbrust commented on SPARK-4892: - I'll add that the right fix here is pro

[jira] [Updated] (SPARK-4892) java.io.FileNotFound exceptions when creating EXTERNAL hive tables

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4892: Target Version/s: 1.3.0 > java.io.FileNotFound exceptions when creating EXTERNAL hive tables

[jira] [Updated] (SPARK-4892) java.io.FileNotFound exceptions when creating EXTERNAL hive tables

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4892: Labels: starter (was: ) > java.io.FileNotFound exceptions when creating EXTERNAL hive table

[jira] [Updated] (SPARK-4520) SparkSQL exception when reading certain columns from a parquet file

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4520: Target Version/s: 1.3.0 (was: 1.2.0) > SparkSQL exception when reading certain columns from

[jira] [Updated] (SPARK-4850) "GROUP BY" can't work if the schema of SchemaRDD contains struct or array type

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4850: Description: Code in Spark Shell as follows: {code} val sqlContext = new org.apache.spark.s

[jira] [Updated] (SPARK-4850) "GROUP BY" can't work if the schema of SchemaRDD contains struct or array type

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4850: Assignee: Cheng Lian > "GROUP BY" can't work if the schema of SchemaRDD contains struct or a

[jira] [Updated] (SPARK-4850) "GROUP BY" can't work if the schema of SchemaRDD contains struct or array type

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4850: Target Version/s: 1.3.0 (was: 1.2.0) > "GROUP BY" can't work if the schema of SchemaRDD con

[jira] [Updated] (SPARK-4811) Custom UDTFs not working in Spark SQL

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4811: Target Version/s: 1.3.0 (was: 1.2.0) > Custom UDTFs not working in Spark SQL >

[jira] [Updated] (SPARK-4553) query for parquet table with string fields in spark sql hive get binary result

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4553: Target Version/s: 1.3.0 (was: 1.2.0) > query for parquet table with string fields in spark

[jira] [Updated] (SPARK-3863) Cache broadcasted tables and reuse them across queries

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3863: Target Version/s: 1.3.0 (was: 1.2.0) > Cache broadcasted tables and reuse them across queri

[jira] [Updated] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3862: Target Version/s: 1.3.0 (was: 1.2.0) > MultiWayBroadcastInnerHashJoin > ---

[jira] [Updated] (SPARK-3865) Dimension table broadcast shouldn't be eager

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3865: Target Version/s: 1.3.0 (was: 1.2.0) > Dimension table broadcast shouldn't be eager > -

[jira] [Updated] (SPARK-3864) Specialize join for tables with unique integer keys

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3864: Target Version/s: 1.3.0 (was: 1.2.0) > Specialize join for tables with unique integer keys

[jira] [Commented] (SPARK-4794) Wrong parse of GROUP BY query

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253964#comment-14253964 ] Michael Armbrust commented on SPARK-4794: - Ping. > Wrong parse of GROUP BY query

[jira] [Updated] (SPARK-4904) Remove the foldable checking in HiveGenericUdf.eval

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4904: Target Version/s: 1.3.0 > Remove the foldable checking in HiveGenericUdf.eval >

[jira] [Updated] (SPARK-4689) Unioning 2 SchemaRDDs should return a SchemaRDD in Python, Scala, and Java

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4689: Labels: 1.0.3 (was: ) > Unioning 2 SchemaRDDs should return a SchemaRDD in Python, Scala, a

[jira] [Updated] (SPARK-4801) Add CTE capability to HiveContext

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4801: Description: This is a request to add CTE functionality to HiveContext. Common Table Expre

[jira] [Updated] (SPARK-4801) Add CTE capability to HiveContext

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4801: Target Version/s: 1.3.0 > Add CTE capability to HiveContext > --

[jira] [Resolved] (SPARK-4735) Spark SQL UDF doesn't support 0 arguments.

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4735. - Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Cheng Hao > Spark SQL UDF

[jira] [Created] (SPARK-4907) Inconsistent loss and gradient in LeastSquaresGradient compared with R

2014-12-19 Thread DB Tsai (JIRA)
DB Tsai created SPARK-4907: -- Summary: Inconsistent loss and gradient in LeastSquaresGradient compared with R Key: SPARK-4907 URL: https://issues.apache.org/jira/browse/SPARK-4907 Project: Spark Iss

[jira] [Commented] (SPARK-4907) Inconsistent loss and gradient in LeastSquaresGradient compared with R

2014-12-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253978#comment-14253978 ] Apache Spark commented on SPARK-4907: - User 'dbtsai' has created a pull request for th

[jira] [Commented] (SPARK-4865) rdds exposed to sql context via registerTempTable are not listed via thrift jdbc show tables

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253979#comment-14253979 ] Michael Armbrust commented on SPARK-4865: - Temporary tables are tied to a specific

[jira] [Resolved] (SPARK-4762) Add support for tuples in 'where in' clause query

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4762. - Resolution: Won't Fix This issue can be reopened if the hive parser is ever extended to su

[jira] [Updated] (SPARK-2075) Anonymous classes are missing from Spark distribution

2014-12-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2075: --- Assignee: Shixiong Zhu > Anonymous classes are missing from Spark distribution > -

[jira] [Commented] (SPARK-4865) rdds exposed to sql context via registerTempTable are not listed via thrift jdbc show tables

2014-12-19 Thread Misha Chernetsov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253994#comment-14253994 ] Misha Chernetsov commented on SPARK-4865: - > Or are you creating a JDBC server wit

[jira] [Updated] (SPARK-4636) Cluster By & Distribute By output different with Hive

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4636: Target Version/s: 1.3.0 > Cluster By & Distribute By output different with Hive > --

[jira] [Commented] (SPARK-4589) ML add-ons to SchemaRDD

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254002#comment-14254002 ] Michael Armbrust commented on SPARK-4589: - Can you elaborate what you are thinking

[jira] [Updated] (SPARK-2973) Add a way to show tables without executing a job

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2973: Target Version/s: 1.3.0 (was: 1.2.0) > Add a way to show tables without executing a job > -

[jira] [Commented] (SPARK-2973) Add a way to show tables without executing a job

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254006#comment-14254006 ] Michael Armbrust commented on SPARK-2973: - I think the solution here is to also sp

[jira] [Updated] (SPARK-4865) Include temporary tables in SHOW TABLES

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4865: Summary: Include temporary tables in SHOW TABLES (was: rdds exposed to sql context via regi

[jira] [Updated] (SPARK-4865) Include temporary tables in SHOW TABLES

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4865: Priority: Critical (was: Major) > Include temporary tables in SHOW TABLES > ---

[jira] [Updated] (SPARK-4865) Include temporary tables in SHOW TABLES

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4865: Target Version/s: 1.3.0 > Include temporary tables in SHOW TABLES >

[jira] [Updated] (SPARK-4629) Spark SQL uses Hadoop Configuration in a thread-unsafe manner when writing Parquet files

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4629: Target Version/s: 1.3.0 > Spark SQL uses Hadoop Configuration in a thread-unsafe manner when

[jira] [Updated] (SPARK-4760) "ANALYZE TABLE table COMPUTE STATISTICS noscan" failed estimating table size for tables created from Parquet files

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4760: Target Version/s: 1.3.0 Affects Version/s: (was: 1.3.0) > "ANALYZE TABLE table COMP

[jira] [Updated] (SPARK-4689) Unioning 2 SchemaRDDs should return a SchemaRDD in Python, Scala, and Java

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4689: Labels: starter (was: 1.0.3) > Unioning 2 SchemaRDDs should return a SchemaRDD in Python, S

[jira] [Updated] (SPARK-4760) "ANALYZE TABLE table COMPUTE STATISTICS noscan" failed estimating table size for tables created from Parquet files

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4760: Priority: Critical (was: Major) > "ANALYZE TABLE table COMPUTE STATISTICS noscan" failed es

[jira] [Updated] (SPARK-4689) Unioning 2 SchemaRDDs should return a SchemaRDD in Python, Scala, and Java

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4689: Target Version/s: 1.3.0 > Unioning 2 SchemaRDDs should return a SchemaRDD in Python, Scala,

[jira] [Updated] (SPARK-4648) Support COALESCE function in Spark SQL and HiveQL

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4648: Target Version/s: 1.3.0 Assignee: Ravindra Pesala > Support COALESCE function in

[jira] [Commented] (SPARK-4564) SchemaRDD.groupBy(groupingExprs)(aggregateExprs) doesn't return the groupingExprs as part of the output schema

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254016#comment-14254016 ] Michael Armbrust commented on SPARK-4564: - It is however consistent with SQL, wher

[jira] [Resolved] (SPARK-4564) SchemaRDD.groupBy(groupingExprs)(aggregateExprs) doesn't return the groupingExprs as part of the output schema

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4564. - Resolution: Won't Fix I'm going to close this wontfix unless there is major objection. Ha

[jira] [Updated] (SPARK-4502) Spark SQL reads unnecessary fields from Parquet

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4502: Priority: Critical (was: Major) Target Version/s: 1.3.0 > Spark SQL reads unnec

[jira] [Updated] (SPARK-4476) Use MapType for dict in json which has unique keys in each row.

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4476: Target Version/s: 1.3.0 > Use MapType for dict in json which has unique keys in each row. >

[jira] [Commented] (SPARK-4367) Process the "distinct" value before shuffling for aggregation

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254020#comment-14254020 ] Michael Armbrust commented on SPARK-4367: - So we already do this for SUM and COUNT

[jira] [Resolved] (SPARK-4469) Move the SemanticAnalyzer from Physical Execution to Analysis

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4469. - Resolution: Fixed Assignee: Cheng Hao > Move the SemanticAnalyzer from Physical Exec

[jira] [Updated] (SPARK-4657) Support storing decimals in Parquet that don't fit in a LONG

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4657: Summary: Support storing decimals in Parquet that don't fit in a LONG (was: RuntimeException

[jira] [Updated] (SPARK-4657) Support storing decimals in Parquet that don't fit in a LONG

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4657: Target Version/s: 1.3.0 Issue Type: Improvement (was: Bug) > Support storing decim

[jira] [Updated] (SPARK-4176) Support decimals with precision > 18 in Parquet

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4176: Target Version/s: 1.3.0 > Support decimals with precision > 18 in Parquet >

[jira] [Resolved] (SPARK-4657) Suport storing decimals in Parquet that don't fit in a LONG

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4657. - Resolution: Duplicate > Suport storing decimals in Parquet that don't fit in a LONG >

[jira] [Updated] (SPARK-4512) Unresolved Attribute Exception for sort by

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4512: Target Version/s: 1.3.0 > Unresolved Attribute Exception for sort by > -

[jira] [Updated] (SPARK-4302) Make jsonRDD/jsonFile support more field data types

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4302: Target Version/s: 1.3.0 > Make jsonRDD/jsonFile support more field data types >

[jira] [Updated] (SPARK-4296) Throw "Expression not in GROUP BY" when using same expression in group by clause and select clause

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4296: Priority: Critical (was: Major) > Throw "Expression not in GROUP BY" when using same expres

[jira] [Updated] (SPARK-4296) Throw "Expression not in GROUP BY" when using same expression in group by clause and select clause

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4296: Target Version/s: 1.3.0 > Throw "Expression not in GROUP BY" when using same expression in g

[jira] [Resolved] (SPARK-4209) Support UDT in UDF

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4209. - Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Michael Armbrust Fixed he

[jira] [Resolved] (SPARK-4201) Can't use concat() on partition column in where condition (Hive compatibility problem)

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4201. - Resolution: Fixed Fix Version/s: 1.2.0 Since this was reported working in master I'

[jira] [Resolved] (SPARK-4135) Error reading Parquet file generated with SparkSQL

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4135. - Resolution: Won't Fix Assignee: Michael Armbrust > Error reading Parquet file genera

[jira] [Commented] (SPARK-4135) Error reading Parquet file generated with SparkSQL

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254035#comment-14254035 ] Michael Armbrust commented on SPARK-4135: - The problem here is you have to columns

[jira] [Resolved] (SPARK-4248) [SQL] spark sql not support add jar

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4248. - Resolution: Fixed Fix Version/s: 1.2.0 > [SQL] spark sql not support add jar > ---

[jira] [Commented] (SPARK-4317) Error querying Avro files imported by Sqoop: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254039#comment-14254039 ] Michael Armbrust commented on SPARK-4317: - Is this still a problem in recent versi

[jira] [Updated] (SPARK-3851) Support for reading parquet files with different but compatible schema

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3851: Priority: Critical (was: Major) Target Version/s: 1.3.0 Issue Type: Im

[jira] [Resolved] (SPARK-3295) [Spark SQL] schemaRdd1 ++ schemaRdd2 does not return another SchemaRdd

2014-12-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-3295. - Resolution: Won't Fix These are actually different operations. UnionAll is similar to the
