[jira] [Updated] (SPARK-3058) Support EXTENDED for EXPLAIN command

2014-08-14 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-3058: - Description: Currently, there is no difference when running the EXPLAIN command with or without the EXTENDED keyword

RE: [sql]enable spark sql cli support spark sql

2014-08-14 Thread Cheng, Hao
Actually the SQL Parser (another SQL dialect in SparkSQL) is quite weak and only supports some basic queries; I'm not sure what the plan is for its enhancement. -Original Message- From: scwf [mailto:wangf...@huawei.com] Sent: Friday, August 15, 2014 11:22 AM To: dev@spark.apache.org Subject:

RE: Spark SQL Stackoverflow error

2014-08-14 Thread Cheng, Hao
I couldn’t reproduce the exception; it has probably been fixed in the latest code. From: Vishal Vibhandik [mailto:vishal.vibhan...@gmail.com] Sent: Thursday, August 14, 2014 11:17 AM To: user@spark.apache.org Subject: Spark SQL Stackoverflow error Hi, I tried running the sample sql code JavaSparkSQL

[jira] [Created] (SPARK-2917) Avoid CTAS creates table in logical plan analyzing.

2014-08-07 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2917: Summary: Avoid CTAS creates table in logical plan analyzing. Key: SPARK-2917 URL: https://issues.apache.org/jira/browse/SPARK-2917 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-2918) EXPLAIN doesn't support the native command

2014-08-07 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2918: Summary: EXPLAIN doesn't support the native command Key: SPARK-2918 URL: https://issues.apache.org/jira/browse/SPARK-2918 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-2918) EXPLAIN doesn't support the native command

2014-08-07 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090226#comment-14090226 ] Cheng Hao edited comment on SPARK-2918 at 8/8/14 2:44 AM

[jira] [Commented] (SPARK-2918) EXPLAIN doesn't support the native command

2014-08-07 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090226#comment-14090226 ] Cheng Hao commented on SPARK-2918: -- Usually shouldn't be a problem in a normal SQL query

[jira] [Created] (SPARK-2826) Reduce the Memory Copy for HashOuterJoin

2014-08-04 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2826: Summary: Reduce the Memory Copy for HashOuterJoin Key: SPARK-2826 URL: https://issues.apache.org/jira/browse/SPARK-2826 Project: Spark Issue Type: Improvement

RE: Substring in Spark SQL

2014-08-04 Thread Cheng, Hao
From the log, I noticed that substr was added on July 15th; the 1.0.1 release should be earlier than that. The community is now working on releasing 1.1.0, which also includes some performance improvements. You could try that for your benchmark. Cheng Hao -Original Message

[jira] [Created] (SPARK-2767) SparkSQL CLI doesn't output error message if query failed.

2014-07-31 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2767: Summary: SparkSQL CLI doesn't output error message if query failed. Key: SPARK-2767 URL: https://issues.apache.org/jira/browse/SPARK-2767 Project: Spark Issue Type

RE: SparkSQL can not use SchemaRDD from Hive

2014-07-29 Thread Cheng, Hao
In your code snippet, sample is actually a SchemaRDD, and a SchemaRDD is bound to a specific SQLContext at runtime; I don't think we can manipulate or share a SchemaRDD across SQLContext instances. -Original Message- From: Kevin Jung [mailto:itsjb.j...@samsung.com] Sent: Tuesday, July

[jira] [Created] (SPARK-2665) Add EqualNS support for HiveQL

2014-07-24 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2665: Summary: Add EqualNS support for HiveQL Key: SPARK-2665 URL: https://issues.apache.org/jira/browse/SPARK-2665 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-2663) Support the GroupingSet/ROLLUP/CUBE

2014-07-23 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2663: Summary: Support the GroupingSet/ROLLUP/CUBE Key: SPARK-2663 URL: https://issues.apache.org/jira/browse/SPARK-2663 Project: Spark Issue Type: New Feature

[jira] [Issue Comment Deleted] (SPARK-2663) Support the GroupingSet/ROLLUP/CUBE

2014-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2663: - Comment: was deleted (was: https://github.com/apache/spark/pull/1567) Support the GroupingSet/ROLLUP

[jira] [Commented] (SPARK-2663) Support the GroupingSet/ROLLUP/CUBE

2014-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072841#comment-14072841 ] Cheng Hao commented on SPARK-2663: -- https://github.com/apache/spark/pull/1567 Support

[jira] [Created] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2615: Summary: Add == support for HiveQl Key: SPARK-2615 URL: https://issues.apache.org/jira/browse/SPARK-2615 Project: Spark Issue Type: Bug Components: SQL

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069948#comment-14069948 ] Cheng Hao commented on SPARK-2615: -- https://github.com/apache/spark/pull/1522 Add

[jira] [Issue Comment Deleted] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2615: - Comment: was deleted (was: https://github.com/apache/spark/pull/1522) Add == support for HiveQl

[jira] [Commented] (SPARK-2615) Add == support for HiveQl

2014-07-22 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071214#comment-14071214 ] Cheng Hao commented on SPARK-2615: -- Yes, that's true. But == is actually used in lots

RE: Joining by timestamp.

2014-07-21 Thread Cheng, Hao
This is a very interesting problem. SparkSQL supports non-equi joins, but they are very inefficient on large tables. One possible solution is to partition both tables, using (cast(ds as bigint) / 240) as the partition key; then, with each partition in dataset1, you probably can
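
The bucketing idea in this message can be illustrated outside Spark. The following is only a minimal plain-Python sketch, not the author's implementation: the message is cut off before it says what to do with each partition, so the sketch assumes the common approach of comparing each bucket against the same and neighbouring buckets. The names `bucket_of`, `join_by_timestamp` and `max_gap` are hypothetical, timestamps are assumed to be Unix seconds, and integer division stands in for the `cast(ds as bigint) / 240` key.

```python
from collections import defaultdict

# Bucket width in seconds, taken from the (cast(ds as bigint) / 240) key in the message.
BUCKET_SECONDS = 240


def bucket_of(ts):
    """Return the integer bucket key for a Unix timestamp in seconds."""
    return ts // BUCKET_SECONDS


def join_by_timestamp(left, right, max_gap=BUCKET_SECONDS):
    """Pair values from two lists of (timestamp, value) rows.

    Two rows match when their timestamps differ by at most max_gap seconds.
    Only rows in the same or nearby buckets are compared, which avoids the
    full cross product that a naive non-equi join would evaluate.
    """
    # Index the right-hand rows by bucket key.
    buckets = defaultdict(list)
    for ts, value in right:
        buckets[bucket_of(ts)].append((ts, value))

    # Number of neighbouring buckets to inspect on each side (ceiling division).
    span = -(-max_gap // BUCKET_SECONDS)

    matches = []
    for ts, value in left:
        key = bucket_of(ts)
        for neighbour in range(key - span, key + span + 1):
            for other_ts, other_value in buckets.get(neighbour, ()):
                if abs(ts - other_ts) <= max_gap:
                    matches.append((value, other_value))
    return matches
```

In Spark itself the same shape would presumably be expressed by keying both datasets on the bucket and joining on that key, but that mapping is an assumption and is not shown in the thread.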

RE: Joining by timestamp.

2014-07-21 Thread Cheng, Hao
Actually, what I described is just a pseudo algorithm; you can implement it with the Spark API. I hope the algorithm is helpful. -Original Message- From: durga [mailto:durgak...@gmail.com] Sent: Tuesday, July 22, 2014 11:56 AM To: u...@spark.incubator.apache.org Subject: RE: Joining by timestamp. Hi Chen,

RE: Joining by timestamp.

2014-07-21 Thread Cheng, Hao
Durga, you can start from the documents http://spark.apache.org/docs/latest/quick-start.html http://spark.apache.org/docs/latest/programming-guide.html -Original Message- From: durga [mailto:durgak...@gmail.com] Sent: Tuesday, July 22, 2014 12:45 PM To:

RE: Hive From Spark

2014-07-20 Thread Cheng, Hao
Subject: RE: Hive From Spark Hi Cheng Hao, Thank you very much for your reply. Basically, the program runs on Spark 1.0.0 and Hive 0.12.0 . Some setups of the environment are done by running SPARK_HIVE=true sbt/sbt assembly/assembly, including the jar in all the workers, and copying the hive

[jira] [Created] (SPARK-2570) ClassCastException from HiveFromSpark(examples)

2014-07-17 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2570: Summary: ClassCastException from HiveFromSpark(examples) Key: SPARK-2570 URL: https://issues.apache.org/jira/browse/SPARK-2570 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2570) ClassCastException from HiveFromSpark(examples)

2014-07-17 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065905#comment-14065905 ] Cheng Hao commented on SPARK-2570: -- https://github.com/apache/spark/pull/1475

[jira] [Created] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2523: Summary: Potential Bugs if SerDe is not the identical among partitions and table Key: SPARK-2523 URL: https://issues.apache.org/jira/browse/SPARK-2523 Project: Spark

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063288#comment-14063288 ] Cheng Hao commented on SPARK-2523: -- This is the follow up for https://github.com/apache

[jira] [Comment Edited] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063288#comment-14063288 ] Cheng Hao edited comment on SPARK-2523 at 7/16/14 8:42 AM

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063289#comment-14063289 ] Cheng Hao commented on SPARK-2523: -- [~yhuai] Can you review the code for me? Potential

[jira] [Created] (SPARK-2540) Add More Types Support for unwarpData of HiveUDF

2014-07-16 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2540: Summary: Add More Types Support for unwarpData of HiveUDF Key: SPARK-2540 URL: https://issues.apache.org/jira/browse/SPARK-2540 Project: Spark Issue Type

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064570#comment-14064570 ] Cheng Hao commented on SPARK-2523: -- sbt/sbt hive/console {code:title=prepare.scala

[jira] [Commented] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-16 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064576#comment-14064576 ] Cheng Hao commented on SPARK-2523: -- I think the root cause is the when ALTER table

[jira] [Commented] (SPARK-2213) Sort Merge Join

2014-06-25 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044279#comment-14044279 ] Cheng Hao commented on SPARK-2213: -- This probably depends on [SPARK-2045|https

[jira] [Commented] (SPARK-2216) Cost-based join reordering

2014-06-20 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038510#comment-14038510 ] Cheng Hao commented on SPARK-2216: -- Yes, this can be a big change, I think we need to add

[jira] [Updated] (SPARK-2215) Multi-way join

2014-06-20 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2215: - Description: Support the multi-way join (multiple table joins) in a single reduce stage if they have

[jira] [Updated] (SPARK-2215) Multi-way join

2014-06-20 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2215: - Description: Support the multi-way join (multiple table joins) in a single reduce stage if they have

[jira] [Created] (SPARK-2212) HashJoin

2014-06-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2212: Summary: HashJoin Key: SPARK-2212 URL: https://issues.apache.org/jira/browse/SPARK-2212 Project: Spark Issue Type: Sub-task Reporter: Cheng Hao

[jira] [Created] (SPARK-2211) Join Optimization

2014-06-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2211: Summary: Join Optimization Key: SPARK-2211 URL: https://issues.apache.org/jira/browse/SPARK-2211 Project: Spark Issue Type: Improvement Components: SQL

[jira] [Created] (SPARK-2213) Sort Merge Join

2014-06-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2213: Summary: Sort Merge Join Key: SPARK-2213 URL: https://issues.apache.org/jira/browse/SPARK-2213 Project: Spark Issue Type: Sub-task Reporter: Cheng Hao

[jira] [Created] (SPARK-2215) Multi-way join

2014-06-19 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2215: Summary: Multi-way join Key: SPARK-2215 URL: https://issues.apache.org/jira/browse/SPARK-2215 Project: Spark Issue Type: Sub-task Components: SQL

[jira] [Commented] (SPARK-2106) Unify the HiveContext

2014-06-11 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028699#comment-14028699 ] Cheng Hao commented on SPARK-2106: -- Oh, I see your point. Actually I was suggesting is we

[jira] [Commented] (SPARK-2106) Unify the HiveContext

2014-06-11 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028705#comment-14028705 ] Cheng Hao commented on SPARK-2106: -- BTW, calling hql(...).collect() is a good way

RE: Spark SQL incorrect result on GROUP BY query

2014-06-11 Thread Cheng, Hao
)) sparkContext.makeRDD(rows).registerAsTable("foo") sql("select k,count(*) from foo group by k").collect res1: Array[org.apache.spark.sql.Row] = Array([b,200], [a,100], [c,300]) Cheng Hao From: Pei-Lun Lee [mailto:pl...@appier.com] Sent: Wednesday, June 11, 2014 6:01 PM To: user@spark.apache.org Subject
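
As a cross-check of what the GROUP BY query in this message is expected to return, the aggregation can be reproduced in plain Python. This is only a sketch under stated assumptions: the row construction is truncated in the message, so the input below is hypothetical and chosen only to match the counts shown in the reply (100 rows for `a`, 200 for `b`, 300 for `c`). The helper name `count_by_key` is also an assumption.

```python
from collections import Counter

# Hypothetical input: the original row construction is cut off in the message,
# so these rows are built only to match the counts shown in the reply.
rows = (
    [("a", i) for i in range(100)]
    + [("b", i) for i in range(200)]
    + [("c", i) for i in range(300)]
)


def count_by_key(rows):
    """Equivalent of: select k, count(*) from foo group by k."""
    return dict(Counter(k for k, _ in rows))


counts = count_by_key(rows)
```

As in the SQL result, the order of the groups carries no meaning; only the per-key counts are compared.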

[jira] [Created] (SPARK-2106) Unify the HiveContext

2014-06-10 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2106: Summary: Unify the HiveContext Key: SPARK-2106 URL: https://issues.apache.org/jira/browse/SPARK-2106 Project: Spark Issue Type: Improvement Components

[jira] [Updated] (SPARK-2106) Unify the HiveContext

2014-06-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2106: - Description: I've been working on CLI for Catalyst, and from the CLI point of view, HiveContext may

[jira] [Updated] (SPARK-2106) Unify the HiveContext

2014-06-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2106: - Description: I've been working on CLI for Catalyst, and from the CLI point of view, HiveContext may

[jira] [Commented] (SPARK-2106) Unify the HiveContext

2014-06-10 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027360#comment-14027360 ] Cheng Hao commented on SPARK-2106: -- [~marmbrus], let me know if you have some input

RE: Is Spark-1.0.0 not backward compatible with Shark-0.9.1 ?

2014-06-10 Thread Cheng, Hao
And if you want to use the SQL CLI (based on catalyst) as it works in Shark, you can also check out https://github.com/amplab/shark/pull/337 :) This preview version doesn’t require Hive to be set up in the cluster. (Don’t forget to put the hive-site.xml under SHARK_HOME/conf as well) Cheng Hao

[jira] [Created] (SPARK-2076) Push Down the Predicate Join Filter for OutJoin

2014-06-08 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-2076: Summary: Push Down the Predicate Join Filter for OutJoin Key: SPARK-2076 URL: https://issues.apache.org/jira/browse/SPARK-2076 Project: Spark Issue Type

[jira] [Updated] (SPARK-2076) Push Down the Predicate Join Filter for OuterJoin

2014-06-08 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-2076: - Summary: Push Down the Predicate Join Filter for OuterJoin (was: Push Down the Predicate Join Filter

[jira] [Commented] (SPARK-2076) Push Down the Predicate Join Filter for OuterJoin

2014-06-08 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021630#comment-14021630 ] Cheng Hao commented on SPARK-2076: -- PR can be found at https://github.com/apache/spark

[jira] [Commented] (SPARK-1461) Support Short-circuit Expression Evaluation

2014-04-18 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973830#comment-13973830 ] Cheng Hao commented on SPARK-1461: -- The PR https://github.com/apache/spark/pull/446

[jira] [Updated] (SPARK-1360) Add Timestamp Support

2014-03-31 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-1360: - Description: Add Timestamp Support for Catalyst/SQLParser/HiveQl (was: Add Timestamp Support for both

[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen

2013-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4864: Affects Version/s: 0.9.0 Status: Patch Available (was: Open) Code Comments seems

[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen

2013-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4864: Attachment: 1.patch Please check the attachment. Code Comments seems confused between

[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen

2013-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4864: Attachment: 2.patch Code Comments seems confused between GenericUDFCase GenericUDFWhen

[jira] [Commented] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen

2013-07-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717873#comment-13717873 ] Cheng Hao commented on HIVE-4864: - Thanks [~ashutoshc], please check the [^2.patch

[jira] [Created] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen

2013-07-15 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-4864: --- Summary: Code Comments seems confused between GenericUDFCase GenericUDFWhen Key: HIVE-4864 URL: https://issues.apache.org/jira/browse/HIVE-4864 Project: Hive Issue

[jira] [Created] (HIVE-4855) Failed in equality check of PrimitiveTypeInfo after ser/de

2013-07-14 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-4855: --- Summary: Failed in equality check of PrimitiveTypeInfo after ser/de Key: HIVE-4855 URL: https://issues.apache.org/jira/browse/HIVE-4855 Project: Hive Issue Type: Bug

[jira] [Created] (HIVE-4777) Null value Versus RuntimeException in failed data type converting

2013-06-21 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-4777: --- Summary: Null value Versus RuntimeException in failed data type converting Key: HIVE-4777 URL: https://issues.apache.org/jira/browse/HIVE-4777 Project: Hive Issue

[jira] [Created] (HIVE-4410) Dummy Storage handler for the select performance benchmarking

2013-04-24 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-4410: --- Summary: Dummy Storage handler for the select performance benchmarking Key: HIVE-4410 URL: https://issues.apache.org/jira/browse/HIVE-4410 Project: Hive Issue Type

[jira] [Updated] (HIVE-4410) Dummy Storage handler for the select performance benchmarking

2013-04-24 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4410: Fix Version/s: 0.9.0 Labels: storage-handler (was: ) Status: Patch Available

[jira] [Updated] (HIVE-4410) Dummy Storage handler for the select performance benchmarking

2013-04-24 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4410: Attachment: dummy.patch Dummy Storage handler for the select performance benchmarking

[jira] [Commented] (HIVE-4410) Dummy Storage handler for the select performance benchmarking

2013-04-24 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640144#comment-13640144 ] Cheng Hao commented on HIVE-4410: - for example: CREATE TABLE dummy_uservisits (sourceIP

[jira] [Created] (HIVE-3823) Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat

2012-12-19 Thread Cheng Hao (JIRA)
Cheng Hao created HIVE-3823: --- Summary: Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat Key: HIVE-3823 URL: https://issues.apache.org/jira/browse/HIVE-3823 Project: Hive

[jira] [Updated] (HIVE-3823) Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat

2012-12-19 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-3823: Description: In HiveHBaseTableInputFormat.java, the Result objects retrieving has performance issue

[jira] [Updated] (HIVE-3823) Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat

2012-12-19 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-3823: Description: In HiveHBaseTableInputFormat.java, the Result objects retrieving has performance issue

[jira] [Updated] (HIVE-3823) Performance issue while retrieving the Result objects in HiveHBaseTableInputFormat

2012-12-19 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-3823: Description: In HiveHBaseTableInputFormat.java, the Result objects retrieving has performance issue

[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-18 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-7381: - Attachment: result_lightweight_copy_v2.patch Thanks, Lars. I didn't notice that in trunk

[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-18 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-7381: - Attachment: (was: result_lightweight_copy.patch) Lightweight data transfer for Class Result

[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-18 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535676#comment-13535676 ] Cheng Hao commented on HBASE-7381: -- oh, sorry, I will take care of that next time

[jira] [Created] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-17 Thread Cheng Hao (JIRA)
Cheng Hao created HBASE-7381: Summary: Lightweight data transfer for Class Result Key: HBASE-7381 URL: https://issues.apache.org/jira/browse/HBASE-7381 Project: HBase Issue Type: Improvement

[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-17 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-7381: - Attachment: result_lightweight_copy.patch Provide a new API in Result class

[jira] [Updated] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-17 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-7381: - Fix Version/s: 0.94.4 Status: Patch Available (was: Open) Lightweight data transfer

[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-17 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534679#comment-13534679 ] Cheng Hao commented on HBASE-7381: -- @Yu The following code is from the Hive

[jira] [Commented] (HBASE-7381) Lightweight data transfer for Class Result

2012-12-17 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534688#comment-13534688 ] Cheng Hao commented on HBASE-7381: -- Thanks @Yu Once this patch is applied, I will create

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-05 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490649#comment-13490649 ] Cheng Hao commented on HBASE-6852: -- @Lars, thank you for committing; the snapshot

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: 6852-0.94_3.patch Lars, Ted, It did have a bug in the v2 patch, please take the 6852

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-02 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489861#comment-13489861 ] Cheng Hao commented on HBASE-6852: -- Ouch! Still failed, and I still couldn't access

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488549#comment-13488549 ] Cheng Hao commented on HBASE-6852: -- Still failed, and I cannot open the URL https

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489176#comment-13489176 ] Cheng Hao commented on HBASE-6852: -- Thanks Lars and Ted, I will try to reproduce

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-31 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: 6852-0.94_2.patch Please take the 6852-0.94_2.patch Found a small bug while updating

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-29 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486068#comment-13486068 ] Cheng Hao commented on HBASE-6852: -- oh, sorry for that, I will resolve it asap

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-15 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: (was: AtomicTest.java) SchemaMetrics.updateOnCacheHit costs too much while full

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-15 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: metrics_hotspots.png Sample callgraph via visualvm, seems the bottleneck

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-15 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476695#comment-13476695 ] Cheng Hao commented on HBASE-6852: -- Sorry, just read an article, the self time may

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: AtomicTest.java I tested the AtomicLong, Counter, and normal function call, and the result

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471150#comment-13471150 ] Cheng Hao commented on HBASE-6852: -- I re-ran the scanning tests, with or without

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-10-06 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471151#comment-13471151 ] Cheng Hao commented on HBASE-6852: -- Sorry, please check the AtomicTest.java attached

[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations

2012-10-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6805: - Attachment: (was: extend_coprocessor.patch) Extend co-processor framework to provide observers

[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations

2012-10-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6805: - Attachment: extend_coprocessor.patch Extend co-processor framework to provide observers for filter

[jira] [Commented] (HBASE-6805) Extend co-processor framework to provide observers for filter operations

2012-10-01 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467428#comment-13467428 ] Cheng Hao commented on HBASE-6805: -- Thank you Andrew for the clarity. I added unit test

[jira] [Updated] (HBASE-6805) Extend co-processor framework to provide observers for filter operations

2012-09-27 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6805: - Attachment: extend_coprocessor.patch Please check the patch attached. Hope it makes more sense

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-27 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465357#comment-13465357 ] Cheng Hao commented on HBASE-6852: -- Hi, stack, the patch does improve the performance

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-23 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461559#comment-13461559 ] Cheng Hao commented on HBASE-6852: -- {quote} Cheng Hao: you said that your dataset size

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460280#comment-13460280 ] Cheng Hao commented on HBASE-6852: -- Lars, the only place to use the ConcurrentMap

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460297#comment-13460297 ] Cheng Hao commented on HBASE-6852: -- Hi Liang, it's a really good suggestion. Actually I

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: onhitcache-trunk.patch change the THRESHOLD_METRICS_FLUSH from 2000 to 100, per Lars

[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: (was: onhitcache-trunk.patch) SchemaMetrics.updateOnCacheHit costs too much while

[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-21 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460316#comment-13460316 ] Cheng Hao commented on HBASE-6852: -- I didn't remove the cacheHits in the HFileReaderV1
