[GitHub] spark pull request: [SPARK-2237][CORE]Add ZLIBCompressionCodec cod...

2014-09-03 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1121 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-2237][CORE]Add ZLIBCompressionCodec cod...

2014-09-03 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1121#issuecomment-54283956 OK, I think so; a small third-party lib may cause unpredictable problems, so I will close this PR. If I have a more mature solution, I will send a PR and make a

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-09-03 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-54283740 Hi marmbrus, I will close it. Best Regards

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-09-03 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1134

[GitHub] spark pull request: [SPARK-2240][SQL]Spark SQL add LeftSemiBloomFi...

2014-09-03 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1127

[GitHub] spark pull request: [SPARK-2240][SQL]Spark SQL add LeftSemiBloomFi...

2014-09-03 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1127#issuecomment-54283611 Hi marmbrus, got it. If I have some other good idea I will try to communicate with you. Thanks, I will close it later.

[GitHub] spark pull request: [SPARK-2373]RDD add span function (split an RD...

2014-08-29 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1306

[GitHub] spark pull request: [SPARK-2373]RDD add span function (split an RD...

2014-08-29 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1306#issuecomment-53863401 OK, got it, I will close this PR.

[GitHub] spark pull request: [SPARK-2239][SQL]Spark SQL basicOperator add D...

2014-07-23 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1145

[GitHub] spark pull request: [SPARK-2239][SQL]Spark SQL basicOperator add D...

2014-07-23 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1145#issuecomment-49954380 Got it, thanks.

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48752406 Hi, I rewrote the code and resolved some earlier problems.

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1134#discussion_r14811559 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins.scala --- @@ -400,3 +401,73 @@ case class BroadcastNestedLoopJoin

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48707289 Hi, I also made a left semi join. I don't know whether this join should be an optimization of the left semi join or a separate join algorithm. I think the 1127 PR also has

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48707013 Thanks Michael, (1) We could make it a user hint, like Hive does: set hive.optimize.skewjoin = true; set hive.skewjoin.key = skew_key_threshold

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-07 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-48266319 Hi Michael, I also have a Skew Join PR that needs to be reviewed and needs some suggestions. Could you help me review it? I have tested that it passes the test suite. Thanks a lot

[GitHub] spark pull request: [SPARK-2373]RDD add span function (split an RD...

2014-07-07 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1306#issuecomment-48149122 This function is useful in some cases, such as when I do Skew Join in another PR: I need to split an RDD into two RDDs, one that has the skew keys and one that does not. val
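
The message above is cut off before its code, so the following is only a minimal sketch of the idea it describes, written against plain Scala collections rather than Spark RDDs. The names `SkewSplit`, `skewKeys`, `split`, and the `threshold` parameter are hypothetical and do not come from the PR; the rule "a key is skewed when it occurs more than `threshold` times" is an assumption made for illustration.

```scala
// Hypothetical sketch on plain Scala collections (no Spark dependency).
object SkewSplit {
  // Assumption: a key counts as skewed when it occurs more than `threshold` times.
  def skewKeys[K, V](rows: Seq[(K, V)], threshold: Int): Set[K] =
    rows.groupBy(_._1).collect { case (k, vs) if vs.size > threshold => k }.toSet

  // Returns (rows whose key is skewed, all remaining rows), preserving input order.
  def split[K, V](rows: Seq[(K, V)], threshold: Int): (Seq[(K, V)], Seq[(K, V)]) = {
    val skewed = skewKeys(rows, threshold)
    rows.partition { case (k, _) => skewed.contains(k) }
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq("a" -> 1, "a" -> 2, "a" -> 3, "b" -> 4, "c" -> 5)
    val (skewedRows, otherRows) = split(rows, threshold = 2)
    println(s"skewed: $skewedRows")
    println(s"others: $otherRows")
  }
}
```

Each half could then be joined with a strategy suited to it; how the PR itself does this is not visible in the truncated text.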

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-06 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48143209 Hi all, I rewrote most of the code and the test suite passes.

[GitHub] spark pull request: [SPARK-2373]RDD add span function (split an RD...

2014-07-05 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1306#issuecomment-48100374 Thanks, I optimized the code so it only evaluates the function once. Other comments are on JIRA.

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-05 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-48099853 Thanks a lot, I have reformatted the code style.

[GitHub] spark pull request: [SPARK-2373]RDD add span function (split an RD...

2014-07-05 Thread YanjieGao
GitHub user YanjieGao opened a pull request: https://github.com/apache/spark/pull/1306 [SPARK-2373]RDD add span function (split an RDD to two RDD based on user's function)] def span(p: T => Boolean): (RDD[T], RDD[T]) Splits this RDD into a prefix/suffix pair accord
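
The proposed signature and its description match `span` on the standard Scala collections, so a short sketch on a `List` may help show what "prefix/suffix pair" means and how it differs from `partition`. This is not the PR's RDD implementation, which is truncated above; `SpanDemo` and the sample values are hypothetical.

```scala
// Sketch using the standard library only; not the RDD code from the PR.
object SpanDemo {
  val xs = List(1, 2, 5, 1, 6)

  // span stops at the first element that fails the predicate:
  // prefix = List(1, 2), suffix = List(5, 1, 6)
  val (prefix, suffix) = xs.span(_ < 3)

  // partition tests every element independently:
  // matching = List(1, 2, 1), rest = List(5, 6)
  val (matching, rest) = xs.partition(_ < 3)

  def main(args: Array[String]): Unit = {
    println(s"span:      $prefix / $suffix")
    println(s"partition: $matching / $rest")
  }
}
```

Splitting out skew keys, the use case mentioned elsewhere in this thread, corresponds to the `partition` behaviour, since every element is tested regardless of position.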

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-48078447 Thanks, I added three blanks after that line. Now there aren't any red lines.

[GitHub] spark pull request: Branch 1.0 Add ZLIBCompressionCodec code

2014-07-04 Thread YanjieGao
Github user YanjieGao closed the pull request at: https://github.com/apache/spark/pull/1115

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-48030332 Hi Michael, I have resolved the conflict.

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1150#discussion_r14555293 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -274,7 +274,7 @@ private[sql] abstract class SparkStrategies

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1150#discussion_r14553981 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala --- @@ -205,3 +205,15 @@ object ExistingRdd { case class

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1150#discussion_r14553975 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -623,6 +623,8 @@ private[hive] object HiveQl { queries.reduceLeft

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-07-04 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-48026439 Hi Michael, I also have a similar PR, #1150 [SPARK-2235][SQL]Spark SQL basicOperator add Intersect operator. Could you help me review it? Thanks!

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-07-03 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-48005168 Hi all, all tests for this PR have passed.

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-07-03 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-48001405 Thanks a lot, all. I have modified the code as you suggested; next time I will match the Spark coding style.

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-47870392 Hi all, I have resolved the conflict.

[GitHub] spark pull request: [SPARK-2240][SQL]Spark SQL add LeftSemiBloomFi...

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1127#issuecomment-47869475 Hi all, I have resolved the conflict. I don't know if this PR has enough value to be merged.

[GitHub] spark pull request: [SPARK-2237][CORE]Add ZLIBCompressionCodec cod...

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1121#issuecomment-47869431 Hi all, I have resolved the conflict. I don't know if this PR has enough value to be merged? Thanks a lot

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-47869343 Hi all, I have resolved the conflict.

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47864285 Thanks a lot, Michael, I have modified the code. The merge build took two hours, but I saw an error in the console test log. I don't know if the new code is the main cause o

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-07-02 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47774796 Hi all, I have resolved the conflict and merged with the master.

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-29 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47480949 Hi all, I have modified the files and updated the code as you suggested. The build was triggered but it didn't merge. I don't know what's the main c

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-27 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47329035 Thanks, I have modified the line.

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-26 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47303095 Hi all, what should I do next? Thanks!

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-26 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-47302963 Hi Michael, adrian-wang, thanks a lot, I have updated all the files as you suggested!

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-06-25 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-47173420 Hi all, I updated 8 files, like the pull request that adds the EXCEPT operator. But when I run the test, it executes the case class CartesianProduct operator. I think there are some mistakes in my

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-06-24 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-46938049 Thanks a lot, Chenghao. This code is like a demo; I think we could improve the sampling phase and use some strategy to judge which key sets are skew keys. we can

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-23 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46890516 Thanks a lot, Michael. I have updated the code. I don't know if all the files are right? Can these files be merged? Thanks a lot!

[GitHub] spark pull request: [SPARK-2235][SQL]Spark SQL basicOperator add I...

2014-06-23 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1150#issuecomment-46817079 Hi all, I have finished the INTERSECT operator and have updated the 6 files this operator needs. Now I need your help to review the code. Thanks a lot!

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-23 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46814099 Hi all, I have finished this Subtract operator. The code compiles and runs. It needs to be reviewed, thanks a lot! (1) because there is a

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060932 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -119,6 +119,7 @@ class SqlParser extends StandardTokenParsers

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1151#discussion_r14060925 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -369,6 +369,17 @@ class SQLQuerySuite extends QueryTest { (3

[GitHub] spark pull request: [SPARK-2234][SQL]Spark SQL basicOperators add ...

2014-06-22 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46800258 Hi marmbrus, I updated these files following your comments, but I think I may have made some mistakes in the code. Could you help me and give me some tips? I will continue

[GitHub] spark pull request: Spark SQL add LeftSemiBloomFilterBroadcastJoin

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1127#issuecomment-46769416 Thanks a lot. My IntelliJ settings are in the two images. I think I may have some error in the settings. Can you help me find what is inconsistent with your IDE? I wan

[GitHub] spark pull request: Spark SQL basicOperators add Except operator

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46769120 Thanks a lot, it's very nice of you. I will work on it and then add code in the other files. I have some problems with some syntax. I have sent a ma

[GitHub] spark pull request: Branch 1.0 Add ZLIBCompressionCodec code

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1115#issuecomment-46755267 Thanks a lot, I will do it as you said. I once submitted it by forking the Spark repository on the web; I wrote the code and ran it in IntelliJ. Then I edited the scala

[GitHub] spark pull request: Branch 1.0 Add ZLIBCompressionCodec code

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1115#issuecomment-46754558 Hi Srowen, markhamstra. I want to merge this to the master branch. Last time I made a mistake. I resubmitted this patch in https://github.com/apache/spark

[GitHub] spark pull request: Spark SQL add LeftSemiBloomFilterBroadcastJoin

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1127#issuecomment-46754487 Hi Zongheng, I reformatted the code. I don't know if that is OK, and I hope you can give me more suggestions. Thanks a lot

[GitHub] spark pull request: Spark SQL basicOperators add Except operator

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1151#issuecomment-46754425 Hi Zongheng, I tried it, and tried adding code like the other operators. I don't know, if I want to add this except operator, do I need to add code or modify code in other

[GitHub] spark pull request: SparkSQL add SkewJoin

2014-06-21 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-46754360 Hi rxin, I reformatted it. Can you give me some suggestions? I will try to make it better.

[GitHub] spark pull request: Spark SQL basicOperators add Ecept operator

2014-06-20 Thread YanjieGao
GitHub user YanjieGao opened a pull request: https://github.com/apache/spark/pull/1151 Spark SQL basicOperators add Ecept operator Hi all, I want to submit an Except operator in basicOperators.scala. In the SQL case, SQL supports an except operation between two tables: select * from
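
The query in the message is cut off, so here is only a small sketch of what an except operation computes, assuming standard SQL EXCEPT semantics (distinct rows of the left input that do not appear in the right input). It uses plain Scala collections, not the operator from the PR; `ExceptDemo`, `left`, and `right` are hypothetical names with made-up rows.

```scala
// Sketch of SQL EXCEPT semantics on plain Scala collections (assumed, not from the PR).
object ExceptDemo {
  // Rows modelled as tuples; the data is illustrative only.
  val left  = Seq((1, "a"), (2, "b"), (2, "b"), (3, "c"))
  val right = Seq((2, "b"), (4, "d"))

  // Distinct rows of `left` that are absent from `right`.
  def except[A](l: Seq[A], r: Seq[A]): Seq[A] = {
    val rightSet = r.toSet
    l.distinct.filterNot(rightSet.contains)
  }

  def main(args: Array[String]): Unit =
    println(except(left, right))
}
```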

[GitHub] spark pull request: Spark SQL basicOperator add Intersect operator

2014-06-20 Thread YanjieGao
GitHub user YanjieGao opened a pull request: https://github.com/apache/spark/pull/1150 Spark SQL basicOperator add Intersect operator Hi all, I want to submit a basic operator, Intersect. For example, in the SQL case: select * from table1 intersect select * from
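
As an illustration of the query shape in the message, the sketch below shows what an intersect returns, assuming standard SQL INTERSECT semantics (distinct rows present in both inputs). It runs on plain Scala collections and is not the PR's operator; `IntersectDemo`, `table1`, and `table2` carry made-up data, and the second table name is an assumption because the original query is truncated.

```scala
// Sketch of SQL INTERSECT semantics on plain Scala collections (assumed, not from the PR).
object IntersectDemo {
  // Rows modelled as tuples; the data is illustrative only.
  val table1 = Seq((1, "a"), (2, "b"), (2, "b"), (3, "c"))
  val table2 = Seq((2, "b"), (3, "c"), (4, "d"))

  // Distinct rows that appear in both inputs, in the order of the left input.
  def intersect[A](l: Seq[A], r: Seq[A]): Seq[A] = {
    val rightSet = r.toSet
    l.distinct.filter(rightSet.contains)
  }

  def main(args: Array[String]): Unit =
    println(intersect(table1, table2))
}
```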