[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-218855824 Yeah sure @andrewor14 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 closed the pull request at: https://github.com/apache/spark/pull/9553
[GitHub] spark pull request: [SPARK-12462][SQL] Add ExpressionDescription t...
Github user xguo27 closed the pull request at: https://github.com/apache/spark/pull/10437
[GitHub] spark pull request: [SPARK-12981][SQL] Fix Python UDF extraction f...
Github user xguo27 closed the pull request at: https://github.com/apache/spark/pull/10935
[GitHub] spark pull request: [SPARK-12981][SQL] Fix Python UDF extraction f...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10935#issuecomment-204840629

Sure @davies. I will close this PR.
[GitHub] spark pull request: [SPARK-12981][SQL] Fix Python UDF extraction f...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10935#issuecomment-190554648

Using these two functionally equivalent code snippets:

Scala
```
val data = Seq((1, "1"), (2, "2"), (3, "2"), (1, "3")).toDF("a", "b")
val my_filter = sqlContext.udf.register("my_filter", (a: Int) => a == 1)
data.select(col("a")).distinct().filter(my_filter(col("a")))
```

Python
```
data = sqlContext.createDataFrame([(1, "1"), (2, "2"), (3, "2"), (1, "3")], ["a", "b"])
my_filter = udf(lambda a: a == 1, BooleanType())
data.select(col("a")).distinct().filter(my_filter(col("a")))
```

The logical plan that comes out of `execute(aggregateCondition)` here https://github.com/apache/spark/blob/916fc34f98dd731f607d9b3ed657bad6cc30df2c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L801 is as below:

Scala
```
Aggregate [a#8], [UDF(a#8) AS havingCondition#11]
+- Project [a#8]
   +- Project [_1#6 AS a#8,_2#7 AS b#9]
      +- LocalRelation [_1#6,_2#7], [[1,1],[2,2],[3,2],[1,3]]
```

Python
```
Project [havingCondition#2]
+- Aggregate [a#0L], [pythonUDF#3 AS havingCondition#2]
   +- EvaluatePython PythonUDF#(a#0L), pythonUDF#3: boolean
      +- Project [a#0L]
         +- LogicalRDD [a#0L,b#1], MapPartitionsRDD[4] at applySchemaToPythonRDD at NativeMethodAccessorImpl.java:-2
```

We can see that in the Python case we inject an extra Project when `execute(aggregateCondition)` goes through ExtractPythonUDFs, but ResolveAggregateFunctions expects an Aggregate here: https://github.com/apache/spark/blob/916fc34f98dd731f607d9b3ed657bad6cc30df2c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L801-L805

With this fix, the logical plan generated for Python UDFs does not construct a Project if the plan is an Aggregate, making it consistent with its Scala counterpart and giving correct results for ResolveAggregateFunctions to consume. After the fix, Python:

```
Aggregate [a#0L], [pythonUDF#3 AS havingCondition#2]
+- EvaluatePython PythonUDF#(a#0L), pythonUDF#3: boolean
   +- Project [a#0L]
      +- LogicalRDD [a#0L,b#1], MapPartitionsRDD[4] at applySchemaToPythonRDD at NativeMethodAccessorImpl.java:-2
```
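The shape of the fix can be sketched in a few lines of plain Python. The node classes and rule function below are hypothetical stand-ins, not Spark's actual classes; the point is only that an Aggregate stays the root of the rewritten plan, with the Python UDF evaluation pushed below it, instead of being wrapped in an extra Project.

```python
# Toy logical-plan nodes (illustrative only, not Spark's classes).
class Plan:
    def __init__(self, child=None):
        self.child = child

class Project(Plan): pass
class Aggregate(Plan): pass
class EvaluatePython(Plan): pass

def extract_python_udfs(plan):
    """Toy version of the UDF-extraction rule described above.

    Before the fix the rule always returned Project(EvaluatePython(child));
    after the fix an Aggregate keeps its type at the root."""
    if isinstance(plan, Aggregate):
        # Keep the Aggregate as root; evaluate the Python UDF below it.
        plan.child = EvaluatePython(plan.child)
        return plan
    return Project(EvaluatePython(plan))
```

This mirrors the plans shown above: with the fix, the Python plan's root stays an Aggregate, which is what ResolveAggregateFunctions expects.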
[GitHub] spark pull request: [SPARK-12981][SQL] Fix Python UDF extraction f...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10935#issuecomment-189430834

@rxin Does this fix look good to you?
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11291#issuecomment-186923232

@hvanhovell I just rebased on top of your new PR; do you mind reviewing again?
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11291#issuecomment-186893918

@hvanhovell In the hashSemiJoin() function, when the condition is empty, the boundCondition always evaluates to true here: https://github.com/apache/spark/blob/8f744fe3d931c2380613b8e5bafa1bb1fd292839/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashSemiJoin.scala#L42-L43

so the exists{...} part of these lines behaves as a no-op: https://github.com/apache/spark/blob/8f744fe3d931c2380613b8e5bafa1bb1fd292839/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashSemiJoin.scala#L87-L89
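The no-op can be seen with a toy semi join in plain Python (illustrative names, not Spark's API): when no condition is supplied, the bound predicate is always true, so the exists-style check over the matched build rows gives exactly the same answer as a plain key-membership lookup, which is why a HashedRelation suffices.

```python
def left_semi_join(stream_rows, build_rows, key, condition=None):
    """Toy left semi join. With condition=None the per-pair predicate
    defaults to always-true, so any(...) over the matches reduces to a
    non-empty (membership) test."""
    bound_condition = condition if condition is not None else (lambda pair: True)
    hashed = {}
    for row in build_rows:
        hashed.setdefault(key(row), []).append(row)
    out = []
    for row in stream_rows:
        matches = hashed.get(key(row), [])
        # With no condition this is True iff matches is non-empty;
        # the same answer a hash-set membership check would give.
        if any(bound_condition((row, m)) for m in matches):
            out.append(row)
    return out
```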
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11291#issuecomment-186878372

@hvanhovell I see, sorry for my lack of patience. :)
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11291#issuecomment-186876120

Looks like the command did not trigger a test?
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11291#issuecomment-186872660

@hvanhovell Could you please advise whether this is the right fix? All Left Semi related tests passed, but I'm not sure what other impact removing the HashSet-related methods might have.
[GitHub] spark pull request: [SPARK-13422][SQL] Use HashedRelation instead ...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/11291

[SPARK-13422][SQL] Use HashedRelation instead of HashSet in Left Semi Joins

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-13422

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11291.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11291

commit a84975a5fcee4b59cd144e23cca806970dc58164
Author: Xiu Guo <xgu...@gmail.com>
Date: 2016-02-21T17:10:01Z

    [SPARK-13422][SQL] Use HashedRelation instead of HashSet in Left Semi Joins
[GitHub] spark pull request: [SPARK-13366] Support Cartesian join for Datas...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11244#issuecomment-186733828

Thanks @marmbrus! I have updated the change following your suggestion.
[GitHub] spark pull request: [SPARK-13366] Support Cartesian join for Datas...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/11244#discussion_r53263079

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
```
@@ -680,6 +681,14 @@ class Dataset[T] private[sql](
     joinWith(other, condition, "inner")
   }

+  /**
+   * Joins this [[Dataset]] returning a [[Tuple2]] for each pair using cartesian join.
+   * Note: cartesian joins are very expensive without a filter that can be pushed down.
+   *
+   * @since 2.0.0
+   */
+  def joinWith[U](other: Dataset[U]): Dataset[(T, U)] = joinWith(other, lit(true), "inner")
```
--- End diff --

Thanks for your feedback @marmbrus. The only join API in Dataset I can find is: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L644 which expects a Column. Do you mean to add some other method like the one in DataFrame: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L383-L385 If so, I'm wondering whether we need to refactor out the code that handles the encoder?
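The one-liner in the diff relies on the identity that a cartesian join is an inner join whose predicate is always true. A toy Python sketch of that identity (illustrative, not Spark code):

```python
def inner_join(left, right, predicate):
    """Toy nested-loop inner join over Python lists."""
    return [(l, r) for l in left for r in right if predicate(l, r)]

def cross_join(left, right):
    # A cartesian join is an inner join on a literally-true condition,
    # mirroring joinWith(other, lit(true), "inner") in the diff above.
    return inner_join(left, right, lambda l, r: True)
```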
[GitHub] spark pull request: [SPARK-13366] Support Cartesian join for Datas...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/11244

[SPARK-13366] Support Cartesian join for Datasets

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-13366

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11244.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11244

commit 27a58df5a07138fc320353dce532955e8abee00d
Author: Xiu Guo <xgu...@gmail.com>
Date: 2016-02-17T22:16:51Z

    [SPARK-13366] Support Cartesian join for Datasets
[GitHub] spark pull request: [SPARK-13283][SQL] Escape column names based o...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/11224#issuecomment-184916252

Yes @JoshRosen, you are referring to the integration test, right?
[GitHub] spark pull request: [SPARK-13283][SQL] Escape column names based o...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/11224

[SPARK-13283][SQL] Escape column names based on JdbcDialect

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-13283

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11224.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11224

commit 8e143b68c102ec1b55d9e7a64ddf3ea40a95d28a
Author: Xiu Guo <xgu...@gmail.com>
Date: 2016-02-16T21:53:27Z

    [SPARK-13283][SQL] Escape column names based on JdbcDialect
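The idea behind dialect-aware escaping can be sketched in a few lines of Python. The dialect-to-quote-character table below is a simplified assumption for illustration, not Spark's actual JdbcDialects registry:

```python
# Hypothetical per-dialect identifier quote characters (illustration only).
QUOTE_CHAR = {
    "mysql": "`",        # MySQL quotes identifiers with backticks
    "postgresql": '"',   # PostgreSQL uses standard double quotes
}

def quote_identifier(dialect, name):
    """Quote a column name with the dialect's identifier quote character,
    falling back to the SQL-standard double quote for unknown dialects."""
    q = QUOTE_CHAR.get(dialect, '"')
    return f"{q}{name}{q}"
```

For example, a reserved word such as `order` becomes a safely quoted identifier in the SQL generated for each dialect.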
[GitHub] spark pull request: [SPARK-12981][SQL] Fix Python UDF extraction f...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10935

[SPARK-12981][SQL] Fix Python UDF extraction for aggregation.

When the ExtractPythonUDFs rule is applied to an Aggregate operator, it becomes a Project. This change fixes that and keeps the Aggregate operator as its original type.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-12981

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10935.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10935

commit d55146f6dd865bff9789a32641de6aa1678b912f
Author: Xiu Guo <xgu...@gmail.com>
Date: 2016-01-26T22:35:50Z

    [SPARK-12981][SQL] Fix Python UDF extraction for aggregation.
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/10515#discussion_r48788553

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala ---
```
@@ -70,15 +70,16 @@ class DefaultSource extends HadoopFsRelationProvider with DataSourceRegister {

 private[sql] class TextRelation(
     val maybePartitionSpec: Option[PartitionSpec],
+    val textSchema: Option[StructType],
     override val userDefinedPartitionColumns: Option[StructType],
     override val paths: Array[String] = Array.empty[String],
     parameters: Map[String, String] = Map.empty[String, String])
     (@transient val sqlContext: SQLContext)
   extends HadoopFsRelation(maybePartitionSpec, parameters) {

-  /** Data schema is always a single column, named "value". */
-  override def dataSchema: StructType = new StructType().add("value", StringType)
+  /** Data schema is always a single column, named "value" if original Data source has no schema. */
+  override def dataSchema: StructType =
+    textSchema.getOrElse(new StructType().add("value", StringType))
```
--- End diff --

@cloud-fan DefaultSource.scala is the only place that creates a TextRelation, and it verifies that the schema is size 1 and of type string before creating a TextRelation. So I think it is fine not to verify again here. What do you think?
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-168564909

@marmbrus Can we trigger a test for this?
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/10515#discussion_r48591393

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
```
@@ -33,8 +33,8 @@ class TextSuite extends QueryTest with SharedSQLContext {
     verifyFrame(sqlContext.read.text(testFile))
   }

-  test("writing") {
-    val df = sqlContext.read.text(testFile)
+  test("SPARK-12562 verify write.text() can handle column name beyond `value`") {
+    val df = sqlContext.read.text(testFile).withColumnRenamed("value", "adwrasdf")
```
--- End diff --

After `write.text()`, the local text file actually does not carry the schema name like JSON does. When reading the text file back and then calling `verifyFrame`, it will always have `value` as the column name.
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-167915743

@marmbrus Thanks Michael for your feedback! Looks like 'value' is there to give the single string column an arbitrary name. The current implementation strips schema information when creating a TextRelation (after verifying that the schema is a single field of string type). That is fine during read, but fails during write. Would you mind taking another look at my updated change?
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10515

[SPARK-12562][SQL] DataFrame.write.format(text) requires the column name to be called value

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-12562

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10515.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10515

commit 4c5c31fc18f2763151a9d4d6f42ceed5eb43d8a7
Author: Xiu Guo <xgu...@gmail.com>
Date: 2015-12-29T23:49:44Z

    [SPARK-12562][SQL] DataFrame.write.format(text) requires the column name to be called value
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10515#issuecomment-167950674

Thanks @viirya! I have updated the comment and added a unit test.
[GitHub] spark pull request: [SPARK-12562][SQL] DataFrame.write.format(text...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/10515#discussion_r48590105

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala ---
```
@@ -58,6 +58,17 @@ class TextSuite extends QueryTest with SharedSQLContext {
     }
   }

+  test("SPARK-12562 verify write.text() can handle column name beyond `value`") {
```
--- End diff --

@rxin I thought about it, but was not sure if it was a good idea to change the existing test case. In the existing test, should I add a second dataframe with the column renamed, or just replace the original dataframe with the column renaming?
[GitHub] spark pull request: [SPARK-12512][SQL] support column name with do...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10500

[SPARK-12512][SQL] support column name with dot in withColumn()

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-12512

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10500.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10500

commit 1b8d53c4692034ce1b292e74c44db506fdeea9af
Author: Xiu Guo <xgu...@gmail.com>
Date: 2015-12-28T23:37:21Z

    [SPARK-12512][SQL] support column name with dot in WithColumn()
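The ambiguity behind this issue: an unquoted dot in a column reference usually means nested-field access, so a literal column name containing a dot needs backtick quoting to resolve as a single name. A toy resolver in Python (illustrative, not Spark's parser) shows the two readings:

```python
def parse_column_ref(ref):
    """Toy column-reference parser: a backtick-quoted name is one
    literal identifier; unquoted dots split into a nested field path."""
    if ref.startswith("`") and ref.endswith("`") and len(ref) >= 2:
        return [ref[1:-1]]       # quoted: a single literal name
    return ref.split(".")        # unquoted: a nested field path
```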
[GitHub] spark pull request: [SPARK-12521][SQL][WIP] JDBCRelation does not ...
Github user xguo27 closed the pull request at: https://github.com/apache/spark/pull/10473
[GitHub] spark pull request: [SPARK-12521][SQL][WIP] JDBCRelation does not ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10473#issuecomment-167258976

Thanks @hvanhovell for clearing it up. I will close this PR.
[GitHub] spark pull request: [SPARK-12521][SQL][WIP] JDBCRelation does not ...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10473

[SPARK-12521][SQL][WIP] JDBCRelation does not honor lowerBound/upperBound

JDBCRelation is not bounding the rows when lowerBound/upperBound are given. This change honors the given bounds.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xguo27/spark SPARK-12521

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10473.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10473

commit b0d03592716369edb390f7811a5d4d530bb0cfe2
Author: Xiu Guo <xgu...@gmail.com>
Date: 2015-12-25T04:08:35Z

    [SPARK-12521][SQL] JDBCRelation does not honor lowerBound/upperBound
[GitHub] spark pull request: [SPARK-12521][SQL][WIP] JDBCRelation does not ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10473#issuecomment-167190872

Marking it [WIP] to invite discussion here. :) I suspect the original code includes infinity on both the smaller-than side and the greater-than side for a reason.
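One plausible reason, sketched in simplified Python (names and arithmetic are illustrative, not the actual JDBCRelation code): the bounds are used only to compute the partition stride, and the first and last partitions are left open-ended so that rows falling outside [lowerBound, upperBound] still land in some partition instead of being silently dropped.

```python
def partition_clauses(column, lower, upper, num_partitions):
    """Build one WHERE predicate per partition. The first partition is
    unbounded below and the last unbounded above, so out-of-range rows
    are still read by exactly one partition."""
    stride = (upper - lower) // num_partitions
    clauses = []
    for i in range(num_partitions):
        lo = lower + i * stride
        hi = lo + stride
        if i == 0:
            clauses.append(f"{column} < {hi}")                       # open below
        elif i == num_partitions - 1:
            clauses.append(f"{column} >= {lo}")                      # open above
        else:
            clauses.append(f"{column} >= {lo} AND {column} < {hi}")
    return clauses
```

Clamping the outer partitions to the given bounds would instead filter those rows out, which is the behavior change this WIP proposed.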
[GitHub] spark pull request: [SPARK-12456][SQL] Add ExpressionDescription t...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10423#issuecomment-166701737

@rxin Great, thanks Reynold! My JIRA id is xguo27.
[GitHub] spark pull request: [SPARK-12462][SQL] Add ExpressionDescription t...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10437 [SPARK-12462][SQL] Add ExpressionDescription to misc non-aggregate functions You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-12462 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10437.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10437 commit a0e99210ea7e3068cf07b9f042a084ab8223d7f2 Author: Xiu Guo <xgu...@gmail.com> Date: 2015-12-22T18:54:52Z [SPARK-12462][SQL] Add ExpressionDescription to misc non-aggregate functions
[GitHub] spark pull request: [SPARK-12456][SQL] Add ExpressionDescription t...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/10423 [SPARK-12456][SQL] Add ExpressionDescription to misc functions First try; I'm not sure how much information we need to provide in the usage part. You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-12456 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10423.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10423 commit 8cae22cd771eea1bc08d1e0903b5e9df6814 Author: Xiu Guo <xgu...@gmail.com> Date: 2015-12-21T22:55:44Z [SPARK-12456][SQL] Add ExpressionDescription to misc functions
[GitHub] spark pull request: [SPARK-12456][SQL] Add ExpressionDescription t...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/10423#issuecomment-166531147 @rxin Thank you very much for going through the changeset, Reynold! I have updated it per your suggestions.
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/9553#discussion_r48095871

--- Diff: repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -1026,17 +1027,30 @@ class SparkILoop(
   @DeveloperApi
   def createSQLContext(): SQLContext = {
-    val name = "org.apache.spark.sql.hive.HiveContext"
+    useHiveContext = sparkContext.getConf.getBoolean("spark.sql.useHiveContext", true)
+    val name = {
+      if (useHiveContext) "org.apache.spark.sql.hive.HiveContext"
+      else "org.apache.spark.sql.SQLContext"
+    }
+
     val loader = Utils.getContextOrSparkClassLoader
     try {
       sqlContext = loader.loadClass(name).getConstructor(classOf[SparkContext])
         .newInstance(sparkContext).asInstanceOf[SQLContext]
-      logInfo("Created sql context (with Hive support)..")
+      if (useHiveContext) {
+        logInfo("Created sql context (with Hive support). To use sqlContext (without Hive), " +
+          "set spark.sql.useHiveContext to false before launching spark-shell.")
+      }
+      else {
+        logInfo("Created sql context.")
+      }
     } catch {
-      case _: java.lang.ClassNotFoundException | _: java.lang.NoClassDefFoundError =>
+      case _: java.lang.ClassNotFoundException | _: java.lang.NoClassDefFoundError
+        if useHiveContext =>
         sqlContext = new SQLContext(sparkContext)
-        logInfo("Created sql context..")
+        logInfo("Created sql context without Hive support, " +
+          "build Spark with -Phive to enable Hive support.")
--- End diff --

When -Phive is used (which provides the necessary Hive jars) and an exception other than ClassNotFound/NoClassDefFound occurs, the way we currently handle it is to let the exception propagate without creating an alternative SQLContext. Do you mean that in this case we should catch -> log -> re-throw?
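The catch -> log -> re-throw option raised above can be sketched in plain Scala (hypothetical names, not the actual SparkILoop code): the failure is logged so the user sees why context creation failed, then the original exception is propagated instead of silently falling back to another context type.

```scala
object LogAndRethrowDemo {
  // Sketch of catch -> log -> re-throw: surface the error, then let
  // the caller decide, rather than swallowing it with a fallback.
  def createContext(build: () => AnyRef): AnyRef =
    try build()
    catch {
      case e: Exception =>
        System.err.println(s"Failed to create sql context: ${e.getMessage}")
        throw e // propagate instead of silently falling back
    }

  def main(args: Array[String]): Unit = {
    println(createContext(() => "ctx"))
    val rethrown =
      try { createContext(() => throw new RuntimeException("boom")); false }
      catch { case e: RuntimeException => e.getMessage == "boom" }
    assert(rethrown)
  }
}
```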
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-165948481 @yhuai I just resolved the conflict. Can we trigger a test? Thanks!
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-162964945 Hi @yhuai, do you think this is good to merge?
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-160779029 Hi @yhuai @liancheng: As I was hitting SPARK-2 when testing my code, I rebased my branch and squashed my previous commits together. The new commit addresses the following points you brought up:
1. call conf.getBoolean() to get the conf value at the right place
2. use spark.sql.useHiveContext instead of spark.sql.hive.context
3. use if/else instead of cases true/false
4. provide extra information when logging
Thanks for reviewing my code!
[GitHub] spark pull request: [SPARK-11562][SQL] Provide option to switch Sq...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-160782796 Looks like some git plugin network issue?
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-160482218 Thanks @yhuai for reviewing my code! I have updated per your suggestion. To answer your question, I personally do not have a use case for this. My take on the JIRA reporter's use case is that users might host their own customized/modified Hive jars on their own Maven repository to provide specific functionality.
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-160332622 @yhuai I see your latest delivery conflicts with this PR; I have resolved the conflict and re-pushed. @rxin has been reviewing this PR, but I figure you might also want to take a look, just in case I break your code. Thanks!
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9603#issuecomment-159191082 OK, I will close it. Thanks!
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user xguo27 closed the pull request at: https://github.com/apache/spark/pull/9603
[GitHub] spark pull request: [SPARK-11897][SQL] Add @scala.annotations.vara...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/9918 [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-11897 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9918.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9918 commit 1dc9751b8e68ab5c1f681b74c2283eb29addc3b8 Author: Xiu Guo <xgu...@gmail.com> Date: 2015-11-23T22:26:47Z [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions
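For readers unfamiliar with the annotation: `@varargs` (from `scala.annotation`) makes a Scala vararg method also callable from Java with plain `...` syntax, by emitting an extra array-taking overload alongside the `Seq`-taking one Scala generates. A minimal self-contained sketch (the `concatWs` name is a made-up stand-in, not one of the SQL functions touched by this PR):

```scala
import scala.annotation.varargs

object VarargsDemo {
  // @varargs tells the compiler to also emit a Java-friendly
  // String... overload next to the Seq[String] form, so Java callers
  // can write VarargsDemo.concatWs("-", "a", "b") directly.
  @varargs
  def concatWs(sep: String, parts: String*): String = parts.mkString(sep)

  def main(args: Array[String]): Unit = {
    println(concatWs("-", "a", "b", "c")) // prints "a-b-c"
  }
}
```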
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9603#issuecomment-159079918 @andrewor14 What is your take on Jacek's comment? I don't think it's a bad idea to make it more consistent with a matching log message. Please let me know. Thx!
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-158694089 Sorry about the failure, can we re-test please?
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-158678073 @rxin Thanks, Reynold! Somehow no test was triggered. Not sure why.
[GitHub] spark pull request: [SPARK-11628][SQL] support column datatype of ...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9612#issuecomment-158678513 @cloud-fan I have added a few tests per your suggestion. Do they look good to you?
[GitHub] spark pull request: [SPARK-11562][SQL] Provide user an option to i...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-158249118 @marmbrus @rxin Does this look good to you guys?
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-158248923 @marmbrus @rxin What do you think about this change?
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9603#issuecomment-157219217 I agree it is trivial; I just thought I could quickly add a log statement. If Jacek agrees, I can close this PR.
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user xguo27 commented on a diff in the pull request: https://github.com/apache/spark/pull/9603#discussion_r44728655

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -506,6 +506,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     val (sched, ts) = SparkContext.createTaskScheduler(this, master)
     _schedulerBackend = sched
     _taskScheduler = ts
+    logDebug("Starting DAGScheduler")
--- End diff --

Hi Jacek: My only concern with putting the log in DAGScheduler's constructor is that the logger might not have been initialized when the constructor is called.
[GitHub] spark pull request: [SPARK-11628][SQL][WIP] support column datatyp...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9612#issuecomment-156276620 Hi Wenchen: Can you elaborate on using ByteType for char a little more? Ultimately, the difference between char(x) and varchar(x) is the fixed/variable length, which results in padding. So it's a good idea to keep the underlying type the same, right?
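The padding distinction mentioned above can be shown with a tiny sketch (plain Scala, hypothetical helper names, not the PR's implementation): a char(n) value is right-padded with spaces to the fixed length and truncated if too long, while varchar(n) stores the value as-is up to the limit.

```scala
object CharPadDemo {
  // char(n): fixed length -- pad short values with trailing spaces,
  // truncate values longer than n.
  def padChar(value: String, n: Int): String =
    if (value.length >= n) value.take(n) else value.padTo(n, ' ')

  // varchar(n): variable length -- store as-is, only enforce the cap.
  def capVarchar(value: String, n: Int): String = value.take(n)

  def main(args: Array[String]): Unit = {
    println("[" + padChar("ab", 5) + "]")    // prints "[ab   ]"
    println("[" + capVarchar("ab", 5) + "]") // prints "[ab]"
  }
}
```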
[GitHub] spark pull request: [SPARK-11628][SQL][WIP] support column datatyp...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/9612 [SPARK-11628][SQL][WIP] support column datatype of Char Can someone review my code to make sure I'm not missing anything? Thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-11628 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9612.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9612 commit 12a7ba151291691d1695fc456da65fb3c005fc2d Author: Xiu Guo <gu...@us.ibm.com> Date: 2015-11-11T00:44:22Z [SPARK-11628][SQL] support column datatype of Char
[GitHub] spark pull request: [SPARK-11562][SQL] Provide user an option to i...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9553#issuecomment-155257787 Hi Zhan: I just updated the documentation and added a guard in the code to address your feedback on the exception handler. Thanks!
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9543#issuecomment-155151066 Thanks WangTao for your comment! Based on the comment on my other PR for Spark-11562, I will also add documentation for this.
[GitHub] spark pull request: [SPARK-11562][SQL] Provide user an option to i...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/9553 [SPARK-11562][SQL] Provide user an option to init SQLContext or HiveContext in spark shell Introducing a boolean property 'spark.sql.hive.context' to turn HiveContext on and off as the default sqlContext type. You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-11562 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9553.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9553 commit cb5892cdd605ec70586c0670ed19d924e0a8eade Author: Xiu Guo <gu...@us.ibm.com> Date: 2015-11-08T20:33:07Z [SPARK-11562][SQL] Provide user an option to init SQLContext or HiveContext in spark shell
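A hypothetical usage sketch of the proposed switch, passed via spark-shell's standard --conf mechanism (the property name here is from this first revision; later review comments rename it to spark.sql.useHiveContext):

```
# Launch spark-shell with a plain SQLContext instead of HiveContext
./bin/spark-shell --conf spark.sql.hive.context=false
```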
[GitHub] spark pull request: [SPARK-11482][SQL] Make maven repo for Hive me...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/9543 [SPARK-11482][SQL] Make maven repo for Hive metastore jars configurable Introducing a property called "spark.sql.hive.maven.repo" to let users configure the Maven repository from which Hive metastore jars are downloaded. You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-11482 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9543.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9543 commit 01352e781ffc62f460c70c865d519531cb336805 Author: Xiu Guo <gu...@us.ibm.com> Date: 2015-11-07T02:01:04Z [SPARK-11482][SQL] Make maven repo for Hive metastore jars configurable
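A hypothetical spark-defaults.conf sketch of how the new property might be combined with the existing spark.sql.hive.metastore.jars setting (the repository URL is made up for illustration):

```
spark.sql.hive.metastore.jars   maven
spark.sql.hive.maven.repo       https://repo.example.com/maven2
```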
[GitHub] spark pull request: [SPARK-11242][SQL] In conf/spark-env.sh.templa...
GitHub user xguo27 opened a pull request: https://github.com/apache/spark/pull/9201 [SPARK-11242][SQL] In conf/spark-env.sh.template SPARK_DRIVER_MEMORY is documented incorrectly Minor fix on the comment You can merge this pull request into a Git repository by running: $ git pull https://github.com/xguo27/spark SPARK-11242 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9201.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9201 commit 5a11872efbb3b871fae900bd0228fcbfb25ad0e1 Author: guoxi <gu...@us.ibm.com> Date: 2015-10-21T18:56:33Z [SPARK-11242] In conf/spark-env.sh.template SPARK_DRIVER_MEMORY is documented incorrectly
[GitHub] spark pull request: [SPARK-11242][SQL] In conf/spark-env.sh.templa...
Github user xguo27 commented on the pull request: https://github.com/apache/spark/pull/9201#issuecomment-150054928 Right, let me change that too. Thx Sean!