[ https://issues.apache.org/jira/browse/BIGTOP-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177972#comment-14177972 ]
Roman Shaposhnik commented on BIGTOP-1450:
------------------------------------------

I was one of the dudes who implemented the Hive tests in Bigtop. What we did was simple: we took the existing tests from Hive and stuck them into Bigtop. That's it. Of course, in Hive the tests get maintained, and they seem to have bitrotted in Bigtop. I actually do like the suggestion of gutting them out and replacing them with more representative smoke tests. But the proof is in the puddin'^H^H^H^H^Hpatch ;-) Anyway, another thing that would be super cool is to somehow collaborate on making the Hive tests from the Apache Hive project itself able to execute against a real cluster. That's what Pig lets us do, for example. Any takers?

> hive smoke test : possibly out of sync, need review, and hard to debug
> ----------------------------------------------------------------------
>
>                 Key: BIGTOP-1450
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1450
>             Project: Bigtop
>          Issue Type: Improvement
>          Components: tests
>            Reporter: jay vyas
>            Assignee: jay vyas
>             Fix For: 0.9.0
>
>
> *Overall: The hive tests in {{test-artifacts}} are prone to failures from missing data sets and generally need a thorough review.*
> When testing the Bigtop 0.8.0 release candidate, I found that I got some errors:
> {noformat}
> [--- /dev/fd/63 2014-09-16 10:12:54.579647323 +0000, +++ /dev/fd/62 2014-09-16 10:12:54.579647323 +0000,
> @@ -14,4 +14,4 @@, INSERT OVERWRITE DIRECTORY '/tmp/count', SELECT COUNT(1) FROM u_data,
> dfs -cat /tmp/count/*, -0, +100000]
> err=[14/09/16 10:12:17 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used., ,
> Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties, OK,
> Time taken: 2.609 seconds, OK, Time taken: 0.284 seconds, Total jobs = 1, Launching Job 1 out of 1,
> Number of reduce tasks determined at compile time: 1,
> In order to change the average load for a reducer (in bytes):, set hive.exec.reducers.bytes.per.reducer=<number>,
> In order to limit the maximum number of reducers:, set hive.exec.reducers.max=<number>,
> In order to set a constant number of reducers:, set mapreduce.job.reduces=<number>,
> Starting Job = job_1410830363557_0019,
> Tracking URL = http://bigtop1.vagrant:20888/proxy/application_1410830363557_0019/,
> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1410830363557_0019,
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1,
> 2014-09-16 10:12:38,870 Stage-1 map = 0%, reduce = 0%,
> 2014-09-16 10:12:45,516 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.81 sec,
> 2014-09-16 10:12:53,036 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.73 sec,
> MapReduce Total cumulative CPU time: 1 seconds 730 msec, Ended Job = job_1410830363557_0019,
> Moving data to: /tmp/count, MapReduce Jobs Launched: ,
> Job 0: Map: 1 Reduce: 1 Cumulative CPU: 1.73 sec HDFS Read: 272 HDFS Write: 2 SUCCESS,
> Total MapReduce CPU Time Spent: 1 seconds 730 msec, OK, Time taken: 24.594 seconds
> {noformat}
> I know there is a diff error in here - some kind of diff is being run, but I forgot how the actual output and the filter work together.
> In any case, I think these tests can be simplified to just grep for an output string and check the error code, or else at least add some very clear assertions about what the failures may be.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
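The "grep for an output string and check the error code" idea from the issue description could look roughly like the sketch below. This is only an illustration of the pattern, not a drop-in replacement for the Bigtop test artifacts (which are Groovy): the helper name `smoke_check` is made up, the query and the expected `100000` are taken from the log above, and since no Hive CLI is assumed to be available here, the runnable demo at the bottom substitutes `echo` for `hive`.

```python
import subprocess

def smoke_check(cmd, expected_output):
    """Run a command, verify its exit code is zero, and grep its
    stdout for an expected string -- the simplified assertion style
    suggested in the issue. Returns True only if both checks pass."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print("FAIL: non-zero exit code %d; stderr: %s"
              % (result.returncode, result.stderr.strip()))
        return False
    if expected_output not in result.stdout:
        print("FAIL: expected %r somewhere in output, got %r"
              % (expected_output, result.stdout.strip()))
        return False
    return True

# On a real cluster this would wrap something like:
#   smoke_check(["hive", "-e", "SELECT COUNT(1) FROM u_data"], "100000")
# (hypothetical invocation; u_data and 100000 come from the log above).
# Here a placeholder command stands in for the hive CLI:
assert smoke_check(["echo", "100000"], "100000")
```

Either outcome prints a one-line reason on failure, which addresses the "hard to debug" part of this ticket: the assertion message says exactly which of the two checks broke.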