[jira] [Created] (SPARK-4744) Short Circuit evaluation for AND OR in code gen

2014-12-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4744: Summary: Short Circuit evaluation for AND OR in code gen Key: SPARK-4744 URL: https://issues.apache.org/jira/browse/SPARK-4744 Project: Spark Issue Type

[jira] [Created] (SPARK-4735) Spark SQL UDF doesn't support 0 arguments.

2014-12-03 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4735: Summary: Spark SQL UDF doesn't support 0 arguments. Key: SPARK-4735 URL: https://issues.apache.org/jira/browse/SPARK-4735 Project: Spark Issue Type: Bug

RE: Spark SQL UDF returning a list?

2014-12-03 Thread Cheng, Hao
/pull/3595 ) b. It expects the function return type to be immutable.Seq[XX] for List, immutable.Map[X, X] for Map, scala.Product for Struct, and only Array[Byte] for binary. The Array[_] is not supported. Cheng Hao From: Tobias Pfeiffer [mailto:t...@preferred.jp] Sent: Thursday, December 4
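The return-type rule in the reply above can be pictured with plain Scala functions. This is only a sketch: the names `Point`, `UdfReturnShapes`, `splitWords`, `charCounts`, `toPoint` and `toBytes` are hypothetical, and no Spark API is called; registering such functions as UDFs is left out because it depends on the Spark version in use.

```scala
import scala.collection.immutable

// Struct-like result: every case class is a scala.Product.
case class Point(x: Int, y: Int)

// Hypothetical helpers; each one returns a shape named in the reply above.
object UdfReturnShapes {
  // List-like result: an immutable Seq rather than an Array[String].
  val splitWords: String => immutable.Seq[String] =
    s => s.split(" ").toList

  // Map-like result: an immutable Map.
  val charCounts: String => immutable.Map[String, Int] =
    s => s.groupBy(_.toString).map { case (k, v) => k -> v.length }

  // Struct-like result: a case class instance.
  val toPoint: Int => Point = i => Point(i, i * 2)

  // Binary result: Array[Byte] is the one array type the reply says is accepted.
  val toBytes: String => Array[Byte] = s => s.getBytes("UTF-8")
}
```

Note that `splitWords` converts the array produced by `split` into a `List`, since the reply says a generic `Array[_]` return type is not supported.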

RE: Spark SQL with a sorted file

2014-12-03 Thread Cheng, Hao
You can try writing your own Relation with filter push-down, or use ParquetRelation2 as a workaround. (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala) Cheng Hao -Original Message- From: Jerry Raj [mailto:jerry

[jira] [Created] (SPARK-4713) SchemaRDD.unpersist() should not raise exception if it is not cached.

2014-12-02 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4713: Summary: SchemaRDD.unpersist() should not raise exception if it is not cached. Key: SPARK-4713 URL: https://issues.apache.org/jira/browse/SPARK-4713 Project: Spark

[jira] [Commented] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-02 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231670#comment-14231670 ] Cheng Hao commented on HIVE-9004: - [~namit] sorry, I am not sure about the review process, can

[jira] [Created] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-9004: --- Summary: Reset doesn't work for the default empty value entry Key: HIVE-9004 URL: https://issues.apache.org/jira/browse/HIVE-9004 Project: Hive Issue Type: Bug

[jira] [Assigned] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao reassigned HIVE-9004: --- Assignee: Cheng Hao Reset doesn't work for the default empty value entry

[jira] [Updated] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-9004: Attachment: reset.patch Reset doesn't work for the default empty value entry

[jira] [Updated] (HIVE-9004) Reset doesn't work for the default empty value entry

2014-12-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-9004: Fix Version/s: 0.14.1 0.15.0 spark-branch Status: Patch

[jira] [Created] (SPARK-4662) Whitelist more Hive unittest

2014-11-30 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4662: Summary: Whitelist more Hive unittest Key: SPARK-4662 URL: https://issues.apache.org/jira/browse/SPARK-4662 Project: Spark Issue Type: Bug Components: SQL

[jira] [Created] (SPARK-4636) Cluster By Distribute By output different with Hive

2014-11-27 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4636: Summary: Cluster By Distribute By output different with Hive Key: SPARK-4636 URL: https://issues.apache.org/jira/browse/SPARK-4636 Project: Spark Issue Type: Bug

RE: Auto BroadcastJoin optimization failed in latest Spark

2014-11-27 Thread Cheng, Hao
From: Jianshi Huang [mailto:jianshi.hu...@gmail.com] Sent: Thursday, November 27, 2014 10:24 PM To: Cheng, Hao Cc: user Subject: Re: Auto BroadcastJoin optimization failed in latest Spark Hi Hao, I'm using inner join as Broadcast join didn't work for left joins (thanks for the links

[jira] [Created] (SPARK-4625) Support Sort By in both DSL SimpleSQLParser

2014-11-26 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4625: Summary: Support Sort By in both DSL SimpleSQLParser Key: SPARK-4625 URL: https://issues.apache.org/jira/browse/SPARK-4625 Project: Spark Issue Type: New Feature

RE: Auto BroadcastJoin optimization failed in latest Spark

2014-11-26 Thread Cheng, Hao
Are all of your join keys the same? I guess the joins are all “Left” joins; https://github.com/apache/spark/pull/3362 is probably what you need. Also, SparkSQL doesn’t currently support multiway joins (or multiway broadcast joins); https://github.com/apache/spark/pull/3270 should be

RE: Spark SQL performance and data size constraints

2014-11-26 Thread Cheng, Hao
Spark SQL doesn't currently support DISTINCT well. In the case you described in particular, all of the data ends up on a single node and is kept only in memory. The dev community has solutions for this, and it will probably be solved after the release of Spark 1.2.

[jira] [Created] (SPARK-4573) Support SettableStructObjectInspector for function wrap in HiveObjectInspectors

2014-11-24 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4573: Summary: Support SettableStructObjectInspector for function wrap in HiveObjectInspectors Key: SPARK-4573 URL: https://issues.apache.org/jira/browse/SPARK-4573 Project: Spark

[jira] [Commented] (SPARK-4573) Support SettableStructObjectInspector for function wrap in HiveObjectInspectors

2014-11-24 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222961#comment-14222961 ] Cheng Hao commented on SPARK-4573: -- HIVE UDAF needs SettableStructObjectInspector

RE: SparkSQL Timestamp query failure

2014-11-23 Thread Cheng, Hao
Can you try a query like “SELECT timestamp, CAST(timestamp as string) FROM logs LIMIT 5”? I guess you probably ran into the timestamp precision or the timezone shifting problem. (And it’s not mandatory, but you’d better change the field name from “timestamp” to something else, as “timestamp” is
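As a rough illustration of the timezone shifting mentioned above, the same stored instant prints differently depending on the zone used to render it. This is a sketch using only JDK classes; the helper `TimestampShift.render` is hypothetical and is not part of Spark SQL.

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat
import java.util.TimeZone

// Hypothetical helper: renders one instant as text in a chosen time zone.
object TimestampShift {
  def render(ts: Timestamp, zone: String): String = {
    val fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    // The instant is unchanged; only its textual rendering depends on the zone.
    fmt.setTimeZone(TimeZone.getTimeZone(zone))
    fmt.format(ts)
  }
}
```

Rendering `new Timestamp(0L)` with zone `"UTC"` and with zone `"GMT+08:00"` gives two strings eight hours apart, which is the kind of mismatch a string comparison against a timestamp column can run into.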

[jira] [Created] (SPARK-4512) Unresolved Attribute Exception for sort by

2014-11-20 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4512: Summary: Unresolved Attribute Exception for sort by Key: SPARK-4512 URL: https://issues.apache.org/jira/browse/SPARK-4512 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-2918) EXPLAIN doens't support the CTAS

2014-11-18 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2918: - Summary: EXPLAIN doens't support the CTAS (was: EXPLAIN doens't support the native command) EXPLAIN

[jira] [Created] (SPARK-4448) Support ConstantObjectInspector for unwrapping data

2014-11-17 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4448: Summary: Support ConstantObjectInspector for unwrapping data Key: SPARK-4448 URL: https://issues.apache.org/jira/browse/SPARK-4448 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-4469) Move the SemanticAnalyzer from Physical Execution to Analysis

2014-11-17 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4469: Summary: Move the SemanticAnalyzer from Physical Execution to Analysis Key: SPARK-4469 URL: https://issues.apache.org/jira/browse/SPARK-4469 Project: Spark Issue

[jira] [Created] (SPARK-4366) Aggregation Optimization

2014-11-12 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4366: Summary: Aggregation Optimization Key: SPARK-4366 URL: https://issues.apache.org/jira/browse/SPARK-4366 Project: Spark Issue Type: Improvement Components

[jira] [Created] (SPARK-4367) Process the distinct value before shuffling for aggregation

2014-11-12 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4367: Summary: Process the distinct value before shuffling for aggregation Key: SPARK-4367 URL: https://issues.apache.org/jira/browse/SPARK-4367 Project: Spark Issue

[jira] [Updated] (SPARK-4233) Simplify the Aggregation Function implementation

2014-11-12 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4233: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-4366 Simplify the Aggregation Function

[jira] [Updated] (SPARK-4367) Process the distinct value before shuffling for aggregation

2014-11-12 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4367: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-4366 Process the distinct value before

[jira] [Updated] (SPARK-3056) Sort-based Aggregation

2014-11-12 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3056: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-4366 Sort-based Aggregation

[jira] [Updated] (SPARK-4274) NEP in printing the details of query plan

2014-11-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4274: - Summary: NEP in printing the details of query plan (was: Hive comparison test framework doesn't print

[jira] [Updated] (SPARK-4274) NEP in printing the details of query plan

2014-11-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4274: - Description: NEP in printing the details of query plan, if the query is not valid. this will great

[jira] [Updated] (SPARK-4274) NPE in printing the details of query plan

2014-11-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4274: - Summary: NPE in printing the details of query plan (was: NEP in printing the details of query plan

[jira] [Updated] (SPARK-4274) NPE in printing the details of query plan

2014-11-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4274: - Description: NPE in printing the details of query plan, if the query is not valid. This will be great

[jira] [Created] (SPARK-4272) Add more unwrap functions for primitive type in TableReader

2014-11-06 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4272: Summary: Add more unwrap functions for primitive type in TableReader Key: SPARK-4272 URL: https://issues.apache.org/jira/browse/SPARK-4272 Project: Spark Issue

[jira] [Created] (SPARK-4274) Hive comparison test framework doesn't print effective information while logical plan analyzing failed

2014-11-06 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4274: Summary: Hive comparison test framework doesn't print effective information while logical plan analyzing failed Key: SPARK-4274 URL: https://issues.apache.org/jira/browse/SPARK-4274

[jira] [Created] (SPARK-4244) ConstantFolding has to be done before initialize the Generic UDF

2014-11-05 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4244: Summary: ConstantFolding has to be done before initialize the Generic UDF Key: SPARK-4244 URL: https://issues.apache.org/jira/browse/SPARK-4244 Project: Spark

[jira] [Created] (SPARK-4250) Create constant null value for Hive Inspectors

2014-11-05 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4250: Summary: Create constant null value for Hive Inspectors Key: SPARK-4250 URL: https://issues.apache.org/jira/browse/SPARK-4250 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-4234) Always do paritial aggregation

2014-11-05 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199457#comment-14199457 ] Cheng Hao commented on SPARK-4234: -- Yes, it looks like that, but we probably need

[jira] [Created] (SPARK-4263) PERCENTILE is not working

2014-11-05 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4263: Summary: PERCENTILE is not working Key: SPARK-4263 URL: https://issues.apache.org/jira/browse/SPARK-4263 Project: Spark Issue Type: Bug Components: SQL

[jira] [Commented] (SPARK-4263) PERCENTILE is not working

2014-11-05 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199747#comment-14199747 ] Cheng Hao commented on SPARK-4263: -- Oh, yes, my bad. Please update it. PERCENTILE

[jira] [Commented] (SPARK-4263) PERCENTILE is not working

2014-11-05 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199749#comment-14199749 ] Cheng Hao commented on SPARK-4263: -- [~gvramana] yes, your PR should fix this, I will mark

[jira] [Resolved] (SPARK-4263) PERCENTILE is not working

2014-11-05 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao resolved SPARK-4263. -- Resolution: Duplicate PERCENTILE is not working - Key

RE: [VOTE] Designating maintainers for some Spark components

2014-11-05 Thread Cheng, Hao
+1, that will definitely speed up PR reviewing / merging. -Original Message- From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Thursday, November 6, 2014 12:46 PM To: dev Subject: Re: [VOTE] Designating maintainers for some Spark components +1 since this is already the de facto

Spark SQL Hive Version

2014-11-05 Thread Cheng, Hao
Hi, all, I noticed that when compiling SparkSQL with the profile hive-0.13.1, it fetches Hive version 0.13.1a under the groupId org.spark-project.hive. How does it differ from the one under org.apache.hive? And where can I get the source code for re-compiling? Thanks, Cheng Hao

RE: [SQL] PERCENTILE is not working

2014-11-05 Thread Cheng, Hao
Which version are you using? I can reproduce that in the latest code, but with a different exception. I've filed a bug, https://issues.apache.org/jira/browse/SPARK-4263, can you also add some information there? Thanks, Cheng Hao -Original Message- From: Kevin Paul [mailto:kevinpaulap

[jira] [Created] (SPARK-4233) Simplify the Aggregation Function implementation

2014-11-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4233: Summary: Simplify the Aggregation Function implementation Key: SPARK-4233 URL: https://issues.apache.org/jira/browse/SPARK-4233 Project: Spark Issue Type

[jira] [Created] (SPARK-4234) Always do paritial aggregation

2014-11-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4234: Summary: Always do paritial aggregation Key: SPARK-4234 URL: https://issues.apache.org/jira/browse/SPARK-4234 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-4235) Add union data type support

2014-11-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4235: Summary: Add union data type support Key: SPARK-4235 URL: https://issues.apache.org/jira/browse/SPARK-4235 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-4152) Avoid data change in CTAS while table already existed

2014-10-30 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4152: Summary: Avoid data change in CTAS while table already existed Key: SPARK-4152 URL: https://issues.apache.org/jira/browse/SPARK-4152 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-4143) Move inner class DeferredObjectAdapter to top level

2014-10-29 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4143: Summary: Move inner class DeferredObjectAdapter to top level Key: SPARK-4143 URL: https://issues.apache.org/jira/browse/SPARK-4143 Project: Spark Issue Type

RE: Build with Hive 0.13.1 doesn't have datanucleus and parquet dependencies.

2014-10-27 Thread Cheng, Hao
The hive-thriftserver module is not included when specifying the profile hive-0.13.1. -Original Message- From: Jianshi Huang [mailto:jianshi.hu...@gmail.com] Sent: Monday, October 27, 2014 4:48 PM To: dev@spark.apache.org Subject: Build with Hive 0.13.1 doesn't have datanucleus and

Support Hive 0.13.1 in Spark SQL

2014-10-27 Thread Cheng, Hao
. Sorry if I missed some discussion of Hive upgrading. Cheng Hao

[jira] [Created] (SPARK-4093) Simplify the unwrap/wrap between HiveUDFs

2014-10-26 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-4093: Summary: Simplify the unwrap/wrap between HiveUDFs Key: SPARK-4093 URL: https://issues.apache.org/jira/browse/SPARK-4093 Project: Spark Issue Type: Improvement

RE: Create table error from Hive in spark-assembly-1.0.2.jar

2014-10-26 Thread Cheng, Hao
Can you paste the hive-site.xml? Most of the time I hit this exception because the JDBC driver for the Hive metastore is not set correctly, or the wrong driver classes are included in the assembly jar. By default, the assembly jar contains the derby.jar, which is the embedded Derby JDBC driver. From:

[jira] [Updated] (SPARK-2663) Support the GroupingSet/ROLLUP/CUBE

2014-10-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2663: - Attachment: grouping_set.pdf General Design for the implementation of GroupingSet, Cube, Rollup

RE: SchemaRDD Convert

2014-10-22 Thread Cheng, Hao
You needn't do anything, the implicit conversion should do this for you. https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L103

RE: spark sql query optimization , and decision tree building

2014-10-22 Thread Cheng, Hao
Not sure how the k-d tree is used in MLlib, but keep in mind SchemaRDD is just a normal RDD. Cheng Hao From: sanath kumar [mailto:sanath1...@gmail.com] Sent: Wednesday, October 22, 2014 12:58 PM To: user@spark.apache.org Subject: spark sql query optimization , and decision tree building Hi all

RE: scala.MatchError: class java.sql.Timestamp

2014-10-19 Thread Cheng, Hao
Seems to be a bug in JavaSQLContext.getSchema(), which doesn't enumerate all of the data types supported by Catalyst. From: Ge, Yao (Y.) [mailto:y...@ford.com] Sent: Sunday, October 19, 2014 11:44 PM To: Wang, Daoyuan; user@spark.apache.org Subject: RE: scala.MatchError: class java.sql.Timestamp

RE: Spark SQL parser bug?

2014-10-12 Thread Cheng, Hao
(1::2::Nil).map(i => T(i.toString, new java.sql.Timestamp(i))) data.registerTempTable("x") val s = sqlContext.sql("select a from x where ts='1970-01-01 00:00:00';") s.collect output: res1: Array[org.apache.spark.sql.Row] = Array([1], [2]) Cheng Hao From: Mohammed Guller [mailto:moham

[jira] [Created] (SPARK-3904) HQL doesn't support the ConstantObjectInspector to pass into UDFs

2014-10-11 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3904: Summary: HQL doesn't support the ConstantObjectInspector to pass into UDFs Key: SPARK-3904 URL: https://issues.apache.org/jira/browse/SPARK-3904 Project: Spark

[jira] [Created] (SPARK-3911) HiveSimpleUdf can not be optimized in constant folding

2014-10-11 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3911: Summary: HiveSimpleUdf can not be optimized in constant folding Key: SPARK-3911 URL: https://issues.apache.org/jira/browse/SPARK-3911 Project: Spark Issue Type

[jira] [Created] (SPARK-3739) Too many splits for small source file in table scanning

2014-09-29 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3739: Summary: Too many splits for small source file in table scanning Key: SPARK-3739 URL: https://issues.apache.org/jira/browse/SPARK-3739 Project: Spark Issue Type

[jira] [Created] (SPARK-3707) Type Coercion for DIV doesn't work for non-numeric argument

2014-09-27 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3707: Summary: Type Coercion for DIV doesn't work for non-numeric argument Key: SPARK-3707 URL: https://issues.apache.org/jira/browse/SPARK-3707 Project: Spark Issue

RE: problem with HiveContext inside Actor

2014-09-17 Thread Cheng, Hao
the null value when retrieving HiveConf. Cheng Hao From: Du Li [mailto:l...@yahoo-inc.com.INVALID] Sent: Thursday, September 18, 2014 7:51 AM To: user@spark.apache.org; d...@spark.apache.org Subject: problem with HiveContext inside Actor Hi, Wonder anybody had similar experience or any suggestion here

RE: SparkSQL 1.1 hang when DROP or LOAD

2014-09-16 Thread Cheng, Hao
Thank you for pasting the steps. I will look at this and hopefully come up with a solution soon. -Original Message- From: linkpatrickliu [mailto:linkpatrick...@live.com] Sent: Tuesday, September 16, 2014 3:17 PM To: u...@spark.incubator.apache.org Subject: RE: SparkSQL 1.1 hang when DROP

RE: SparkSQL 1.1 hang when DROP or LOAD

2014-09-16 Thread Cheng, Hao
is working on upgrading the Hive to 0.13 for SparkSQL (https://github.com/apache/spark/pull/2241), not sure if you can wait for this. ☺ From: Yin Huai [mailto:huaiyin@gmail.com] Sent: Wednesday, September 17, 2014 1:50 AM To: Cheng, Hao Cc: linkpatrickliu; u...@spark.incubator.apache.org Subject

[jira] [Created] (SPARK-3527) Strip the physical plan message margin

2014-09-15 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3527: Summary: Strip the physical plan message margin Key: SPARK-3527 URL: https://issues.apache.org/jira/browse/SPARK-3527 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-3529) Delete the temporal files after test exit

2014-09-15 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3529: Summary: Delete the temporal files after test exit Key: SPARK-3529 URL: https://issues.apache.org/jira/browse/SPARK-3529 Project: Spark Issue Type: Improvement

RE: SparkSQL 1.1 hang when DROP or LOAD

2014-09-15 Thread Cheng, Hao
What's your Spark / Hadoop version? And also the hive-site.xml? Most cases like that are caused by an incompatibility between the Hadoop client jar and the Hadoop cluster. -Original Message- From: linkpatrickliu [mailto:linkpatrick...@live.com] Sent: Monday, September 15, 2014 2:35 PM To:

RE: SparkSQL 1.1 hang when DROP or LOAD

2014-09-15 Thread Cheng, Hao
The Hadoop client jar should be assembled into the uber-jar, but (I suspect) it's probably not compatible with your Hadoop cluster. Can you also paste the Spark uber-jar name? It is usually under the path lib/spark-assembly-1.1.0-xxx-hadoopxxx.jar. -Original Message- From:

RE: SparkSQL 1.1 hang when DROP or LOAD

2014-09-15 Thread Cheng, Hao
Sorry, I am not able to reproduce that. Can you try adding the following entries to the hive-site.xml? I know they have default values, but let's set them explicitly. hive.server2.thrift.port hive.server2.thrift.bind.host hive.server2.authentication (NONE, KERBEROS, LDAP, PAM or CUSTOM)

[jira] [Updated] (SPARK-3393) Align the log4j configuration for Spark SparkSQLCLI

2014-09-12 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3393: - Summary: Align the log4j configuration for Spark SparkSQLCLI (was: Add configuration templates for HQL

[jira] [Created] (SPARK-3501) Hive SimpleUDF will create duplicated type cast which cause exception in constant folding

2014-09-11 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3501: Summary: Hive SimpleUDF will create duplicated type cast which cause exception in constant folding Key: SPARK-3501 URL: https://issues.apache.org/jira/browse/SPARK-3501

RE: Spark SQL JDBC

2014-09-11 Thread Cheng, Hao
I copied the 3 datanucleus jars (datanucleus-api-jdo-3.2.1.jar, datanucleus-core-3.2.2.jar, datanucleus-rdbms-3.2.1.jar) to the folder lib/ manually, and it works for me. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Friday, September 12, 2014 11:28 AM To: alexandria1101 Cc:

[jira] [Created] (SPARK-3455) **HotFix** Unit test failed due to can not resolve the attribute references

2014-09-09 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3455: Summary: **HotFix** Unit test failed due to can not resolve the attribute references Key: SPARK-3455 URL: https://issues.apache.org/jira/browse/SPARK-3455 Project: Spark

[jira] [Created] (SPARK-3412) Add Missing Types for Row API

2014-09-05 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3412: Summary: Add Missing Types for Row API Key: SPARK-3412 URL: https://issues.apache.org/jira/browse/SPARK-3412 Project: Spark Issue Type: Bug Components

[jira] [Created] (SPARK-3407) Add Date type support

2014-09-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3407: Summary: Add Date type support Key: SPARK-3407 URL: https://issues.apache.org/jira/browse/SPARK-3407 Project: Spark Issue Type: Improvement Components

RE: SchemaRDD - Parquet - insertInto makes many files

2014-09-04 Thread Cheng, Hao
Hive can launch another job with a strategy to merge the small files; we can probably also do that in a future release. From: Michael Armbrust [mailto:mich...@databricks.com] Sent: Friday, September 05, 2014 8:59 AM To: DanteSama Cc: u...@spark.incubator.apache.org Subject: Re: SchemaRDD -

[jira] [Created] (SPARK-3392) Set command always get undefined for key mapred.reduce.tasks

2014-09-03 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3392: Summary: Set command always get undefined for key mapred.reduce.tasks Key: SPARK-3392 URL: https://issues.apache.org/jira/browse/SPARK-3392 Project: Spark Issue

[jira] [Created] (SPARK-3393) Add configuration template for HQL user

2014-09-03 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3393: Summary: Add configuration template for HQL user Key: SPARK-3393 URL: https://issues.apache.org/jira/browse/SPARK-3393 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-3393) Add configuration templates for HQL user

2014-09-03 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3393: - Summary: Add configuration templates for HQL user (was: Add configuration template for HQL user) Add

[jira] [Commented] (SPARK-3343) Support for CREATE TABLE AS SELECT that specifies the format

2014-09-02 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119187#comment-14119187 ] Cheng Hao commented on SPARK-3343: -- Actually I was planning to do it after https

[jira] [Commented] (SPARK-3343) Support for CREATE TABLE AS SELECT that specifies the format

2014-09-02 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119190#comment-14119190 ] Cheng Hao commented on SPARK-3343: -- And probably also depends on https://github.com

RE: Unsupported language features in query

2014-09-02 Thread Cheng, Hao
Currently SparkSQL doesn’t support the row format/serde in CTAS. The workaround is to create the table first. -Original Message- From: centerqi hu [mailto:cente...@gmail.com] Sent: Tuesday, September 02, 2014 3:35 PM To: user@spark.apache.org Subject: Unsupported language features in

RE: Unsupported language features in query

2014-09-02 Thread Cheng, Hao
[mailto:cente...@gmail.com] Sent: Tuesday, September 02, 2014 3:46 PM To: Cheng, Hao Cc: user@spark.apache.org Subject: Re: Unsupported language features in query Thanks Cheng Hao Is there a way to obtain the list of Hive statements that Spark supports? Thanks 2014-09-02 15:39 GMT+08:00 Cheng, Hao hao.ch

RE: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-31 Thread Cheng, Hao
Yes, the root cause for that is the output ObjectInspector in SerDe implementation doesn't reflect the real typeinfo. Hive actually provides the API like TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(TypeInfo) for the mapping. You probably need to update the code at

[jira] [Updated] (SPARK-3198) Remove the id property from the TreeNode API

2014-08-26 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3198: - Description: Remove the id property of TreeNode API, since the id generation is kind of performance

[jira] [Updated] (SPARK-3196) Expression Evaluation Performance Improvement

2014-08-26 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3196: - Description: The expression id generations depend on an atomic long object internally, which will cause

[jira] [Created] (SPARK-3197) Reduce the expression tree object creation from the aggregation functions (min/max)

2014-08-25 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3197: Summary: Reduce the expression tree object creation from the aggregation functions (min/max) Key: SPARK-3197 URL: https://issues.apache.org/jira/browse/SPARK-3197 Project

[jira] [Created] (SPARK-3196) Expression Evaluation Performance Improvement

2014-08-25 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3196: Summary: Expression Evaluation Performance Improvement Key: SPARK-3196 URL: https://issues.apache.org/jira/browse/SPARK-3196 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-3198) Improve the expression id generation algorithm

2014-08-25 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3198: Summary: Improve the expression id generation algorithm Key: SPARK-3198 URL: https://issues.apache.org/jira/browse/SPARK-3198 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-3198) Generates the expression id while necessary

2014-08-25 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3198: - Summary: Generates the expression id while necessary (was: Improve the expression id generation

[jira] [Updated] (SPARK-3196) Expression Evaluation Performance Improvement

2014-08-25 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3196: - Description: The expression id generations depend on an atomic long object internally, which will cause

[jira] [Commented] (SPARK-3198) Generates the expression id while necessary

2014-08-25 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108894#comment-14108894 ] Cheng Hao commented on SPARK-3198: -- Usually, we need the expression id in logical plan

[jira] [Commented] (SPARK-3124) Jar version conflict in the assembly package

2014-08-19 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102338#comment-14102338 ] Cheng Hao commented on SPARK-3124: -- Can you try bin/spark-sql after make distribution

[jira] [Commented] (SPARK-3124) Jar version conflict in the assembly package

2014-08-19 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102352#comment-14102352 ] Cheng Hao commented on SPARK-3124: -- Yes, actually I did in the PR. Jar version conflict

RE: [sql]enable spark sql cli support spark sql

2014-08-15 Thread Cheng, Hao
If so, we probably need to add SQL dialect switching support for SparkSQLCLI, as Fei suggested. What do you think the priority of this should be? -Original Message- From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, August 15, 2014 1:57 PM To: Cheng, Hao Cc: scwf; dev

[jira] [Commented] (SPARK-2213) Sort Merge Join

2014-08-14 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097995#comment-14097995 ] Cheng Hao commented on SPARK-2213: -- Sort Merge Join depends on the reduce side sort merge

[jira] [Created] (SPARK-3056) Sort-based Aggregation

2014-08-14 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3056: Summary: Sort-based Aggregation Key: SPARK-3056 URL: https://issues.apache.org/jira/browse/SPARK-3056 Project: Spark Issue Type: Improvement Components

[jira] [Commented] (SPARK-3056) Sort-based Aggregation

2014-08-14 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097997#comment-14097997 ] Cheng Hao commented on SPARK-3056: -- SPARK-2926 provides external sort in reduce side

[jira] [Created] (SPARK-3058) Support EXTENDED for EXPLAIN command

2014-08-14 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-3058: Summary: Support EXTENDED for EXPLAIN command Key: SPARK-3058 URL: https://issues.apache.org/jira/browse/SPARK-3058 Project: Spark Issue Type: Improvement
