[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-02-02 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-72568272
  
Closing this since #4014 has been merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-02-02 Thread chenghao-intel
Github user chenghao-intel closed the pull request at:

https://github.com/apache/spark/pull/4158





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23514639
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -633,14 +633,28 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
Token(script, Nil) ::
Token("TOK_SERDE", serdeClause) ::
Token("TOK_RECORDREADER", readerClause) ::
-   outputClause :: Nil) :: Nil) =>
+   outputClause) :: Nil) =>
 
+// TODO the output should be bind with the output clause or RecordReader
 val output = outputClause match {
-  case Token("TOK_ALIASLIST", aliases) =>
+  case Token("TOK_ALIASLIST", aliases) :: Nil =>
 aliases.map { case Token(name, Nil) => AttributeReference(name, StringType)() }
-  case Token("TOK_TABCOLLIST", attributes) =>
+  case Token("TOK_TABCOLLIST", attributes) :: Nil =>
 attributes.map { case Token("TOK_TABCOL", Token(name, Nil) :: dataType :: Nil) =>
   AttributeReference(name, nodeToDataType(dataType))() }
+  case Nil => // Not specified the output field names, let it be the same as input
+(0 to inputExprs.length - 1).map { idx =>
+  // Keep the same as Hive does, the first field names is "key", and second is
+  // "value", however, Hive seems gives null string for the rest of the
--- End diff --

OK, I see. Thanks for the explanation. I will update that.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23514311
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -633,14 +633,28 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
Token(script, Nil) ::
Token("TOK_SERDE", serdeClause) ::
Token("TOK_RECORDREADER", readerClause) ::
-   outputClause :: Nil) :: Nil) =>
+   outputClause) :: Nil) =>
 
+// TODO the output should be bind with the output clause or RecordReader
 val output = outputClause match {
-  case Token("TOK_ALIASLIST", aliases) =>
+  case Token("TOK_ALIASLIST", aliases) :: Nil =>
 aliases.map { case Token(name, Nil) => AttributeReference(name, StringType)() }
-  case Token("TOK_TABCOLLIST", attributes) =>
+  case Token("TOK_TABCOLLIST", attributes) :: Nil =>
 attributes.map { case Token("TOK_TABCOL", Token(name, Nil) :: dataType :: Nil) =>
   AttributeReference(name, nodeToDataType(dataType))() }
+  case Nil => // Not specified the output field names, let it be the same as input
+(0 to inputExprs.length - 1).map { idx =>
+  // Keep the same as Hive does, the first field names is "key", and second is
+  // "value", however, Hive seems gives null string for the rest of the
--- End diff --

I think these are the expected results, as the Hive manual describes under 'Schema-less Map-reduce Scripts' in [transform](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform):

> If there is no AS clause after USING my_script, Hive assumes that the output of the script contains 2 parts: key which is before the first tab, and value which is the rest after the first tab.

So in your results, the `value` column gets all of the query output after the first tab. The results for table `test2` only look different because of the alignment caused by tabs. It should follow the same rule too.
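
To make the quoted rule concrete, here is a minimal Scala sketch of splitting one line of script output at its first tab. `SchemaLessSplit` and `splitKeyValue` are hypothetical names used only for illustration; they are not part of Spark or Hive. Returning `None` for a line without any tab is an assumption of this sketch, not something the quoted passage states.

```scala
object SchemaLessSplit {
  // Hypothetical helper: models the schema-less rule by splitting one
  // output line of the script at its first tab character.
  def splitKeyValue(line: String): (String, Option[String]) = {
    val tabAt = line.indexOf('\t')
    if (tabAt < 0) {
      // Assumption: with no tab, the whole line is the key and there is no value.
      (line, None)
    } else {
      // Everything after the first tab, including any further tabs, is the value.
      (line.substring(0, tabAt), Some(line.substring(tabAt + 1)))
    }
  }
}
```

With three tab-separated fields such as `"239\t237\t238"`, the sketch yields the key `239` and the value `237\t238`, which is why the remaining columns all appear inside `value`.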






[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71407251
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26069/
Test PASSed.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71407248
  
[Test build #26069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26069/consoleFull) for PR 4158 at commit [`5618fa7`](https://github.com/apache/spark/commit/5618fa7914fefee9ac6fbd6dba17ba8f6e1ff5bd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23509725
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala ---
@@ -54,23 +55,47 @@ case class ScriptTransformation(
   val outputStream = proc.getOutputStream
   val reader = new BufferedReader(new InputStreamReader(inputStream))
 
+  // This projection outputs to the script, which runs in a single process
+  // TODO a Writer SerDe will be placed here.
+  val inputProjection = new InterpretedProjection(input, child.output)
+
+  // This projection is casting the scripts output into user specified data type
+  // TODO a Reader SerDe will be placed here for the casting the output
+  // data type into the required one
+  val outputProjection = new InterpretedProjection(output.zipWithIndex.map {
+case (attr, idx) => if (attr.dataType == StringType) {
+BoundReference(idx, StringType, true)
--- End diff --

Thanks, Done.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71403889
  
[Test build #26069 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26069/consoleFull) for PR 4158 at commit [`5618fa7`](https://github.com/apache/spark/commit/5618fa7914fefee9ac6fbd6dba17ba8f6e1ff5bd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23509647
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -633,14 +633,28 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
Token(script, Nil) ::
Token("TOK_SERDE", serdeClause) ::
Token("TOK_RECORDREADER", readerClause) ::
-   outputClause :: Nil) :: Nil) =>
+   outputClause) :: Nil) =>
 
+// TODO the output should be bind with the output clause or RecordReader
 val output = outputClause match {
-  case Token("TOK_ALIASLIST", aliases) =>
+  case Token("TOK_ALIASLIST", aliases) :: Nil =>
 aliases.map { case Token(name, Nil) => AttributeReference(name, StringType)() }
-  case Token("TOK_TABCOLLIST", attributes) =>
+  case Token("TOK_TABCOLLIST", attributes) :: Nil =>
 attributes.map { case Token("TOK_TABCOL", Token(name, Nil) :: dataType :: Nil) =>
   AttributeReference(name, nodeToDataType(dataType))() }
+  case Nil => // Not specified the output field names, let it be the same as input
+(0 to inputExprs.length - 1).map { idx =>
+  // Keep the same as Hive does, the first field names is "key", and second is
+  // "value", however, Hive seems gives null string for the rest of the
--- End diff --

Thanks for noticing that. I think this is probably a bug in Hive.
I ran these queries in the Hive CLI:
```
set hive.cli.print.header=true;
select transform(key + 1, key - 1, key) using '/bin/cat' from src limit 4;
```

![](https://raw.githubusercontent.com/chenghao-intel/githubimages/master/Selection_001.png)
```
create table test2 as select transform(key + 1, key - 1, key) using 
'/bin/cat' from src limit 4;
```

![](https://raw.githubusercontent.com/chenghao-intel/githubimages/master/Selection_002.png)
And print the result of the table `test2`:

![](https://raw.githubusercontent.com/chenghao-intel/githubimages/master/Selection_003.png)

As you can see, the result is not the expected `key` and `value` columns; that's why I added default field names for the case of more than 2 columns.







[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-23 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71181584
  
@chenghao-intel overall it looks good to me, apart from a few small comments.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-23 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23444792
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala ---
@@ -54,23 +55,47 @@ case class ScriptTransformation(
   val outputStream = proc.getOutputStream
   val reader = new BufferedReader(new InputStreamReader(inputStream))
 
+  // This projection outputs to the script, which runs in a single process
+  // TODO a Writer SerDe will be placed here.
+  val inputProjection = new InterpretedProjection(input, child.output)
+
+  // This projection is casting the scripts output into user specified data type
+  // TODO a Reader SerDe will be placed here for the casting the output
+  // data type into the required one
+  val outputProjection = new InterpretedProjection(output.zipWithIndex.map {
+case (attr, idx) => if (attr.dataType == StringType) {
+BoundReference(idx, StringType, true)
--- End diff --

`BoundReference` can be moved out of the `if` block.
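
A rough sketch of that suggestion, using toy stand-in types rather than Spark's Catalyst classes. The quoted diff is cut off before the `else` branch, so wrapping the reference in a `Cast` for non-string columns is an assumption here, based only on the code comment about casting the script's output; `HoistSketch`, `Attribute`, and `outputExprs` are hypothetical names.

```scala
object HoistSketch {
  // Toy stand-ins, NOT Spark's Catalyst classes; the names mirror the diff
  // only to keep the example readable.
  sealed trait DataType
  case object StringType extends DataType
  case object IntegerType extends DataType

  sealed trait Expression
  case class BoundReference(ordinal: Int, dataType: DataType, nullable: Boolean)
    extends Expression
  case class Cast(child: Expression, dataType: DataType) extends Expression

  case class Attribute(name: String, dataType: DataType)

  // Build the reference once per column, then decide whether a cast is needed.
  def outputExprs(output: Seq[Attribute]): Seq[Expression] =
    output.zipWithIndex.map { case (attr, idx) =>
      val ref = BoundReference(idx, StringType, nullable = true)
      // Assumed else branch: cast the string read from the script to the requested type.
      if (attr.dataType == StringType) ref else Cast(ref, attr.dataType)
    }
}
```

Creating `ref` before the conditional avoids repeating the same `BoundReference(idx, StringType, true)` expression in both branches.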





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-23 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/4158#discussion_r23444732
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala ---
@@ -633,14 +633,28 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
Token(script, Nil) ::
Token("TOK_SERDE", serdeClause) ::
Token("TOK_RECORDREADER", readerClause) ::
-   outputClause :: Nil) :: Nil) =>
+   outputClause) :: Nil) =>
 
+// TODO the output should be bind with the output clause or RecordReader
 val output = outputClause match {
-  case Token("TOK_ALIASLIST", aliases) =>
+  case Token("TOK_ALIASLIST", aliases) :: Nil =>
 aliases.map { case Token(name, Nil) => AttributeReference(name, StringType)() }
-  case Token("TOK_TABCOLLIST", attributes) =>
+  case Token("TOK_TABCOLLIST", attributes) :: Nil =>
 attributes.map { case Token("TOK_TABCOL", Token(name, Nil) :: dataType :: Nil) =>
   AttributeReference(name, nodeToDataType(dataType))() }
+  case Nil => // Not specified the output field names, let it be the same as input
+(0 to inputExprs.length - 1).map { idx =>
+  // Keep the same as Hive does, the first field names is "key", and second is
+  // "value", however, Hive seems gives null string for the rest of the
--- End diff --

According to the Hive manual, there should be only two outputs, `key` and `value`, when no output schema is defined. So I am not sure it is a bug, because it is explicitly described in the manual. I suppose it is a well-known and expected behavior?





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71153531
  
[Test build #25996 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25996/consoleFull) for PR 4158 at commit [`a7b6989`](https://github.com/apache/spark/commit/a7b698945856eb3412aeef92ba22e4956371eb66).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71153535
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25996/
Test PASSed.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71149913
  
@viirya I've updated the code. This is a blocking issue for our partner, so it would be great if you could review it for me. The TODOs I left there can definitely be done in your PR #4014.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71149756
  
[Test build #25996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25996/consoleFull) for PR 4158 at commit [`a7b6989`](https://github.com/apache/spark/commit/a7b698945856eb3412aeef92ba22e4956371eb66).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-71128096
  
Thank you @viirya. This is just a quick fix for my use case. I hope it gets merged soon, and I will leave some comments on your PR.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-70992965
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25959/
Test PASSed.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-70992962
  
[Test build #25959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25959/consoleFull) for PR 4158 at commit [`c8fe7fc`](https://github.com/apache/spark/commit/c8fe7fc37471c38b24e52a5d170fa0741b50c791).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-70989376
  
Hi @chenghao-intel, I have already done this, along with support for a custom field delimiter and SerDe, in PR #4014.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4158#issuecomment-70986035
  
[Test build #25959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25959/consoleFull) for PR 4158 at commit [`c8fe7fc`](https://github.com/apache/spark/commit/c8fe7fc37471c38b24e52a5d170fa0741b50c791).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5364] [SQL] HiveQL transform doesn't su...

2015-01-22 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/4158

[SPARK-5364] [SQL] HiveQL transform doesn't support the non output clause

This is a quick fix for queries (in HiveContext) like:
```
SELECT transform(key + 1, value) USING '/bin/cat' FROM src
```
Ideally, we need to refactor `ScriptTransformation` so that it supports custom SerDes for the reader and writer. I will do that in a follow-up.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark transform

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4158


commit c8fe7fc37471c38b24e52a5d170fa0741b50c791
Author: Cheng Hao 
Date:   2015-01-22T08:09:00Z

fix bug of transform in HiveQL



