[GitHub] spark pull request: [SPARK-12789]Support order by index and group ...

2016-03-21 Thread zhichao-li
Github user zhichao-li closed the pull request at: https://github.com/apache/spark/pull/10731 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2016-02-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9483#issuecomment-190497155 retest this please

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2016-02-28 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9483#issuecomment-190016890 retest this please

[GitHub] spark pull request: [SPARK-8813][SQL]Combine splits by size

2016-02-24 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-188600562 retest this please

[GitHub] spark pull request: [SPARK-12820][SQL]Resolve db.table.column

2016-02-24 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10753#issuecomment-188582658 @chenghao-intel, MySQL (5.5) doesn't support nested data types for now, and Hive (1.2.1) doesn't support the use of "db.table.field" in pro

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2016-02-24 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9483#issuecomment-188561642 retest this please

[GitHub] spark pull request: [SPARK-8813][SQL]Combine splits by size

2016-02-23 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-188129892 retest this please. The failure seems unrelated to this PR: `java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.JoinedRow cann

[GitHub] spark pull request: [SPARK-8813][SQL]Combine splits by size

2016-02-23 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-188033437 retest this please.

[GitHub] spark pull request: [SPARK-8813][SQL]Combine splits by size

2016-02-17 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9097#discussion_r53276223 --- Diff: sql/hive/src/main/java/org/apache/spark/sql/hive/mapred/CombineSplitInputFormat.java --- @@ -0,0 +1,110 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-12789]Support order by index and group ...

2016-02-05 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r52096762 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -322,6 +323,62 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index and group ...

2016-02-05 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r52093518 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -456,21 +455,32 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request: [SPARK-12789]Support order by index and group ...

2016-02-05 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r52092731 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -322,6 +323,62 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-05 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-180249809 Sure, I will provide more description and comments later.

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-04 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r51985837 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -322,6 +323,62 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-04 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r51985716 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -456,21 +455,32 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-04 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r51985561 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -322,6 +323,62 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-01 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-178273065 cc @liancheng

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-02-01 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-178271829 @yhuai any more comments?

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-24 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r50653035 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -446,6 +503,10 @@ class Analyzer( val

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-20 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-173474664 retest this please.

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-18 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172736625 @hvanhovell I wasn't aware of #10052; I would be happy if @dereksabryfb can pick that up.

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-18 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r50072670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -445,6 +445,26 @@ class Analyzer( val

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-18 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r50072543 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala --- @@ -17,6 +17,7 @@ package

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2016-01-17 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9483#issuecomment-172447842 @yhuai @rxin, any thoughts or concerns about this PR? It's common for one table to contain tons of partitions (e.g. one partition every 15 minutes for click data).

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-17 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r49967047 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -445,6 +445,26 @@ class Analyzer( val

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-17 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r49962166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -438,11 +438,25 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-17 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172424905 Yes, it's 1-based indexing for the projection list.
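The 1-based indexing mentioned in this comment can be illustrated with a small sketch. This is not the Analyzer code from the pull request (the diff is not shown here); `resolve_order_by` is a hypothetical helper that assumes the projection list and the ORDER BY items are plain Python lists, with integers standing for positions.

```python
def resolve_order_by(select_list, order_by):
    """Replace integer positions in ORDER BY with the select-list item they name.

    Hypothetical helper for illustration only; positions are 1-based, so
    position 1 refers to the first item of the projection list.
    """
    resolved = []
    for item in order_by:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(item, int) and not isinstance(item, bool):
            if not 1 <= item <= len(select_list):
                raise ValueError(
                    f"ORDER BY position {item} is not in the select list "
                    f"(valid range is 1..{len(select_list)})"
                )
            resolved.append(select_list[item - 1])  # 1-based -> 0-based
        else:
            # Anything that is not an integer is kept as written.
            resolved.append(item)
    return resolved
```

With a select list of `name, age`, `ORDER BY 2` would resolve to `age` under this sketch, and position 0 would be rejected.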

[GitHub] spark pull request: [SPARK-12820][SQL]Resolve db.table.column

2016-01-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10753#issuecomment-171546245 It would return attribute b in table a, resolved within the logic of `table.col`.

[GitHub] spark pull request: [SPARK-12820][SQL]Resolve db.table.column

2016-01-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10753#issuecomment-171544562 With this patch, it would return attribute c in table b. Any suggestions? Should it throw an ambiguity exception in such a case?

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-13 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r49689992 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -441,7 +442,7 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12820]Resolve db.table.column

2016-01-13 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/10753 [SPARK-12820]Resolve db.table.column Currently Spark only supports specifying a column name as `table.col` or `col` in a projection, but it's very common for users to use `db.table.col` espec

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-171486258 retest this please.

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-12 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r49540626 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -441,7 +442,7 @@ class Analyzer

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-12 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r49540331 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -22,8 +22,9 @@ import java.sql.Timestamp import

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-12 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-171113667 cc @chenghao-intel @adrian-wang

[GitHub] spark pull request: [SPARK-12789]Support order by index

2016-01-12 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/10731 [SPARK-12789]Support order by index A number in ORDER BY is treated as a constant expression at the moment. I guess it would be good to let users specify a column by index, which has been supported

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-12-23 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8915#issuecomment-167058985 retest this please.

[GitHub] spark pull request: [SPARK-11624][SPARK-11972][SQL]fix commands th...

2015-12-23 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9589#discussion_r48398910 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala --- @@ -151,29 +152,34 @@ private[hive] class ClientWrapper

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-12-21 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8915#issuecomment-166544975 retest this please.

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-12-21 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8915#issuecomment-166502389 retest this please.

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-12-17 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8915#issuecomment-165699875 @andrewor14 @liancheng, options like "-S" and "-v" would still be discarded, since they are all stored within SessionState, which would be reconstru

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2015-11-18 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9483#discussion_r45285408 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -211,7 +211,7 @@ abstract class RDD[T: ClassTag]( // Our dependencies and

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2015-11-18 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9483#discussion_r45285157 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/ParallelUnionRDD.scala --- @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-11-08 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-154919462 `CombineHiveInputFormat` or `CombineFileInputFormat` would have the restriction that they always assume the combined inputformat is a subclass of

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-11-08 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-154917321 @watermen, yes. It should support all formats in theory, since it combines at the `InputSplit` level, which is the result of `inputformat.getSplits`. In other words

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-11-04 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-153973454 retest this please.

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-11-04 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-153934785 @chenghao-intel Just tested with data that has 15w (150,000) small files and 1000 partitions. 1) SQL (select count(*) from test): only improves a little bit, I guess

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2015-11-04 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9483#issuecomment-153930848 cc @chenghao-intel

[GitHub] spark pull request: [SPARK-11517][SQL]Calc partitions in parallel ...

2015-11-04 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/9483 [SPARK-11517][SQL]Calc partitions in parallel for multiple partitions table Currently we call getPartitions for each "hive partition" sequentially; it would be faster
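The contrast between sequential and parallel per-partition listing can be sketched as follows. This is only an illustration of the general idea, not the code from the pull request (the description above is truncated). `get_splits` is a hypothetical stand-in for the per-partition getPartitions call, `fake_get_splits` is a dummy used to exercise it, and the sketch assumes that call is dominated by I/O such as file-system metadata lookups, which is why a thread pool is used.

```python
from concurrent.futures import ThreadPoolExecutor


def list_splits_sequential(partitions, get_splits):
    """Call get_splits once per partition, one after another."""
    return [get_splits(partition) for partition in partitions]


def list_splits_parallel(partitions, get_splits, max_workers=8):
    """Call get_splits for many partitions concurrently.

    Executor.map yields results in input order, so the output lines up
    with the sequential version.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(get_splits, partitions))


def fake_get_splits(partition):
    """Hypothetical stand-in: pretend every partition holds two splits."""
    return [f"{partition}/split-0", f"{partition}/split-1"]
```

Both functions return the same list of per-partition results; only the scheduling of the calls differs.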

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-20 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-149745579 cc @srowen @andrewor14 This seems pretty close to being merged. Any more comments?

[GitHub] spark pull request: [SPARK-10656][SQL]fix selection fails when a c...

2015-10-20 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8811#issuecomment-149745245 retest this please

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-19 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-149387561 retest this please

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-19 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-149125256 I think there are no other hidden pitfalls. I just grepped the code: most HDFSCacheTaskLocation instances are created explicitly, except for (https://github.com/apache

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-10-15 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9097#discussion_r42201471 --- Diff: sql/hive/src/main/java/org/apache/spark/sql/hive/mapred/CombineSplitRecordReader.java --- @@ -0,0 +1,128 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-14 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-148296512 Sounds good to have one. Let me update the unit test.

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-14 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-148255163 retest this please.

[GitHub] spark pull request: [SPARK-11121][Core] Correct the TaskLocation t...

2015-10-14 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-148243280 retest this please

[GitHub] spark pull request: [Core] Correct the TaskLocation type

2015-10-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9096#issuecomment-147929956 retest this please

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-10-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-147898449 retest this please

[GitHub] spark pull request: [SPARK-8813][SQL][WIP]Combine splits by size

2015-10-13 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/9097#issuecomment-147898425 cc @chenghao-intel

[GitHub] spark pull request: [Core]Remove useless if branch

2015-10-13 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/9096#discussion_r41945397 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -62,12 +62,5 @@ private[spark] object TaskLocation { * These

[GitHub] spark pull request: [WIP]Combine splits by size

2015-10-13 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/9097 [WIP]Combine splits by size The idea is simple: it tries to solve this problem by combining, by size, the splits generated by the underlying inputformat, so it would support all of
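One way to picture "combining splits by size" is a greedy pass over the split sizes. The sketch below is an assumption about the general technique, not the algorithm in the pull request (whose description is truncated above); `combine_splits` and `target_size` are hypothetical names, and splits are represented only by their sizes in bytes.

```python
def combine_splits(split_sizes, target_size):
    """Greedily group consecutive splits so each group stays near target_size.

    Returns a list of groups, each a list of indexes into split_sizes.
    A split larger than target_size ends up in a group of its own.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")

    groups = []
    current = []
    current_size = 0
    for index, size in enumerate(split_sizes):
        # Close the current group if adding this split would overflow it.
        if current and current_size + size > target_size:
            groups.append(current)
            current = []
            current_size = 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

Because the grouping only looks at the sizes of splits that some input format has already produced, a scheme like this does not depend on the file format, which matches the motivation stated in the description.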

[GitHub] spark pull request: [Core]Remove useless if branch

2015-10-13 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/9096 [Core]Remove useless if branch We don't need the `if` check here; it's redundant. The final result would always use `hstr`.

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-09-24 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8915#issuecomment-143136694 ping @chenghao-intel @yhuai @liancheng

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-09-24 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/8915#discussion_r40401487 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala --- @@ -174,8 +174,6 @@ private[hive] class

[GitHub] spark pull request: [SPARK-10427][SQL]Respect -S option for spark-...

2015-09-24 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/8915 [SPARK-10427][SQL]Respect -S option for spark-sql The "-S" option is currently ignored by the conf-setting logic; we need to pick it up to complete the "silent" functio

[GitHub] spark pull request: [SPARK-10310] [SQL] Fixes script transformatio...

2015-09-22 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8860#issuecomment-142469660 LGTM

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-21 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8476#issuecomment-142178824 @liancheng thanks for taking care of this.

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-21 Thread zhichao-li
Github user zhichao-li closed the pull request at: https://github.com/apache/spark/pull/8476

[GitHub] spark pull request: [SPARK-10310] [SQL] Fixes script transformatio...

2015-09-21 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8860#issuecomment-142178297 @liancheng I guess there is still an issue, e.g. if the user specifies the serde as "LazySimpleSerde" explicitly, then it would use `Text.writ

[GitHub] spark pull request: [SPARK-10656][SQL]fix selection fails when a c...

2015-09-21 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8811#issuecomment-142159010 @chenghao-intel, `UnresolvedAttribute.parseAttributeName` still lacks the ability to handle a case like this: ``name``, so we cannot add backticks in all cases

[GitHub] spark pull request: [SPARK-10656][SQL]fix selection fails when a c...

2015-09-21 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/8811#discussion_r40047528 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -1116,6 +1116,18 @@ class SQLQuerySuite extends

[GitHub] spark pull request: [SPARK-10656][SQL]fix selection fails when a c...

2015-09-20 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8811#issuecomment-141891350 cc @chenghao-intel @yhuai @liancheng

[GitHub] spark pull request: [SPARK-10656][SQL]fix selection fails when a c...

2015-09-18 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/8811 [SPARK-10656][SQL]fix selection fails when a column has special characters Best explained with this example: val df = sqlContext.read.json(sqlContext.sparkContext.makeRDD( "&q

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-07 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8476#issuecomment-138214691 Previously there was no `TextRecordWriter` or `TextRecordReader` involved, only manual reading and `Writable.write()` for serialization. I would remove the

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-07 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/8476#discussion_r38838257 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala --- @@ -328,23 +361,27 @@ case class HiveScriptIOSchema

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-07 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/8476#discussion_r38838170 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala --- @@ -429,7 +429,8 @@ class HiveQuerySuite extends

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-09-01 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8476#issuecomment-136914580 cc @yhuai @liancheng would you mind taking a look at this?

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-30 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8476#issuecomment-136226698 retest this please.

[GitHub] spark pull request: [SPARK-10310][SQL]Using \t as the field delime...

2015-08-26 Thread zhichao-li
GitHub user zhichao-li opened a pull request: https://github.com/apache/spark/pull/8476 [SPARK-10310][SQL]Using \t as the field delimeter and \n as the line delimeter Currently we are using `LazySimpleSerDe` to serialize the script input by default, but it would use '\001

[GitHub] spark pull request: [SPARK-8813][SQL] Combine files when there're ...

2015-08-20 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/8125#issuecomment-133259180 Hive also faced a similar problem and solved it with `CombineHiveInputFormat`, which is based on `CombineFileInputFormat`; not sure if we can integrate that into

[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-08-10 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/5280#discussion_r36704746 --- Diff: core/src/main/scala/org/apache/spark/util/EventLoop.scala --- @@ -76,9 +76,21 @@ private[spark] abstract class EventLoop[E](name: String

[GitHub] spark pull request: [SPARK-6621][Core] Fix the bug that calling Ev...

2015-08-10 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/5280#discussion_r36637068 --- Diff: core/src/main/scala/org/apache/spark/util/EventLoop.scala --- @@ -76,9 +76,21 @@ private[spark] abstract class EventLoop[E](name: String

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-08-06 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/7709#issuecomment-128281099 retest this please. It failed in an unrelated test.

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-08-05 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r36381923 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -349,6 +351,78 @@ case class EndsWith(left

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-08-05 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/7709#issuecomment-127915988 retest this please.

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-08-03 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r36153127 --- Diff: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -741,6 +742,21 @@ public static UTF8String concatWs(UTF8String

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-08-03 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/7709#issuecomment-127144590 retest this please.

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-07-31 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r35995498 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -283,6 +283,118 @@ case class EndsWith(left

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-07-31 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r35995074 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -1718,6 +1718,18 @@ object functions { def ascii(e: Column): Column

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-07-31 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r35994906 --- Diff: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -508,6 +508,43 @@ public static UTF8String concatWs(UTF8String

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-07-31 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r35994820 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -283,6 +283,118 @@ case class EndsWith(left

[GitHub] spark pull request: [SPARK-8266][SQL]add function translate

2015-07-31 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7709#discussion_r35950539 --- Diff: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -507,6 +507,48 @@ public static UTF8String concatWs(UTF8String

[GitHub] spark pull request: [SPARK-7119][SQL]Give script a default serde w...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/6638#issuecomment-126190874 retest this please.

[GitHub] spark pull request: [SPARK-7119][SQL]Give script a default serde w...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/6638#issuecomment-126180428 retest this please.

[GitHub] spark pull request: [SPARK-8263][SQL] substr/substring should also...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/7641#issuecomment-126167866 cc @rxin @davies @chenghao-intel

[GitHub] spark pull request: [SPARK-7119][SQL]Give script a default serde w...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/6638#issuecomment-126166722 @JoshRosen Could you please take a look at these changes? This PR simply gives the script a default serde if neither the `formatter` nor the `serde` is given

[GitHub] spark pull request: [SPARK-7119][SQL]Give script a default serde w...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/6638#discussion_r35739540 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala --- @@ -75,49 +53,4 @@ class

[GitHub] spark pull request: [SPARK-7119][SQL]Give script a default serde w...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/6638#issuecomment-125876589 Essentially, not much code is added in this PR; it mainly deletes some code and always gives the script a default serde. I will rebase the code shortly.

[GitHub] spark pull request: [SPARK-8263][SQL] substr/substring should also...

2015-07-29 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/7641#discussion_r35732675 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -699,8 +732,12 @@ case class Substring(str
