[
https://issues.apache.org/jira/browse/HIVE-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107442#comment-14107442
]
Brock Noland commented on HIVE-7702:
------------------------------------
Thank you Chinna!! I agree: I used the script below, and all of the result
differences are due to sorting. Thank you!
+1
{noformat}
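#!/bin/bash
# Reads paths of Spark result files from stdin, one per line, and diffs each
# against its MR counterpart after sorting both.
while read -r file
do
  # The MR file is at the same path with the /spark component removed.
  mr=$(echo "$file" | perl -pe "s@/spark@@g")
  spark=$file
  mrSorted=/tmp/$(basename "$mr")-mr.sorted
  sparkSorted=/tmp/$(basename "$spark")-spark.sorted
  sort "$mr" > "$mrSorted"
  sort "$spark" > "$sparkSorted"
  # Side-by-side diff, 150 columns wide.
  diff -y -W 150 "$mrSorted" "$sparkSorted"
done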
{noformat}
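If only a yes/no answer per file is wanted, the same order-insensitive comparison could be wrapped in a small function. This is a minimal sketch and not part of the patch: the name same_ignoring_order is hypothetical, and it assumes mktemp, sort and cmp are available.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the two files contain the same lines
# regardless of order, 1 when they differ, 2 if a temp file cannot be created.
same_ignoring_order() {
  local a b rc
  a=$(mktemp) || return 2
  b=$(mktemp) || { rm -f "$a"; return 2; }
  sort "$1" > "$a"
  sort "$2" > "$b"
  # cmp -s is silent and reports the result through its exit status only.
  cmp -s "$a" "$b"
  rc=$?
  rm -f "$a" "$b"
  return $rc
}
```

Called as same_ignoring_order "$mr" "$spark" for each pair of result files, it would flag only the files whose differences are not explained by ordering.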
> Start running .q file tests on spark [Spark Branch]
> ---------------------------------------------------
>
> Key: HIVE-7702
> URL: https://issues.apache.org/jira/browse/HIVE-7702
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Brock Noland
> Assignee: Chinna Rao Lalam
> Attachments: HIVE-7702-spark.patch, HIVE-7702.1-spark.patch
>
>
> Spark can currently only support a few queries; however, there are some .q
> file tests which will pass today. The basic idea is that we should get some
> number of these (10-20) working so we can actually start testing the
> project.
> A good starting point might be the udf*, varchar*, or alter* tests:
> https://github.com/apache/hive/tree/spark/ql/src/test/queries/clientpositive
> To generate the output file for test XXX.q, you'd do:
> {noformat}
> mvn clean install -DskipTests -Phadoop-2
> cd itests
> mvn clean install -DskipTests -Phadoop-2
> cd qtest-spark
> mvn test -Dtest=TestCliDriver -Dqfile=XXX.q -Dtest.output.overwrite=true -Phadoop-2
> {noformat}
> which would generate XXX.q.out, which we can check in to source control as a
> "golden file".
> Multiple tests can be run at a given time like so:
> {noformat}
> mvn test -Dtest=TestCliDriver -Dqfile=X1.q,X2.q -Dtest.output.overwrite=true -Phadoop-2
> {noformat}
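For a whole family of tests such as udf*, the comma-separated -Dqfile value could be assembled rather than typed by hand. A minimal sketch, not part of the issue description: the function name qfile_list is hypothetical, and the directory and pattern arguments are whatever the caller chooses.

```shell
#!/bin/bash
# Hypothetical helper: prints the base names of the files in directory $1
# that match glob pattern $2, joined with commas (the form -Dqfile expects).
qfile_list() {
  local dir="$1" pattern="$2" list="" f
  # $pattern is deliberately unquoted so the shell expands the glob.
  for f in "$dir"/$pattern; do
    [ -e "$f" ] || continue   # nothing matched: skip the literal pattern
    list="${list:+$list,}$(basename "$f")"
  done
  printf '%s\n' "$list"
}
```

From itests/qtest-spark this might be used as -Dqfile="$(qfile_list ../../ql/src/test/queries/clientpositive 'udf*.q')", assuming the query directory sits at that relative path in the checkout.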