Build failed in Hudson: Hive-trunk-h0.17 #8
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/8/changes

Changes:

[zshao] HIVE-270. Add a lazy-deserialized SerDe for efficient deserialization of rows with primitive types. (zshao)

------------------------------------------
[...truncated 16709 lines...]
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_column2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_column2.q.out
    [junit] Done query: unknown_column2.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180523_68094412.txt
    [junit] Begin query: unknown_column3.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_column3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_column3.q.out
    [junit] Done query: unknown_column3.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180523_576389447.txt
    [junit] Begin query: unknown_column4.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_column4.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_column4.q.out
    [junit] Done query: unknown_column4.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180523_-1227788210.txt
    [junit] Begin query: unknown_column5.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_column5.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_column5.q.out
    [junit] Done query: unknown_column5.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180523_538024551.txt
    [junit] Begin query: unknown_column6.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_column6.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_column6.q.out
    [junit] Done query: unknown_column6.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180523_-1608223043.txt
    [junit] Begin query: unknown_function1.q
    [junit]
Build failed in Hudson: Hive-trunk-h0.18 #9
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/9/changes

Changes:

[zshao] HIVE-270. Add a lazy-deserialized SerDe for efficient deserialization of rows with primitive types. (zshao)

------------------------------------------
[...truncated 19148 lines...]
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column2.q.out
    [junit] Done query: unknown_column2.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180624_-1207805739.txt
    [junit] Begin query: unknown_column3.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column3.q.out
    [junit] Done query: unknown_column3.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180624_-355194879.txt
    [junit] Begin query: unknown_column4.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column4.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column4.q.out
    [junit] Done query: unknown_column4.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180624_238035720.txt
    [junit] Begin query: unknown_column5.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column5.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column5.q.out
    [junit] Done query: unknown_column5.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180624_1911369666.txt
    [junit] Begin query: unknown_column6.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column6.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column6.q.out
    [junit] Done query: unknown_column6.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180624_-295681886.txt
    [junit] Begin query: unknown_function1.q
    [junit]
Build failed in Hudson: Hive-trunk-h0.19 #8
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/8/changes

Changes:

[zshao] HIVE-270. Add a lazy-deserialized SerDe for efficient deserialization of rows with primitive types. (zshao)

------------------------------------------
[...truncated 18767 lines...]
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column2.q.out
    [junit] Done query: unknown_column2.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180724_474849503.txt
    [junit] Begin query: unknown_column3.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column3.q.out
    [junit] Done query: unknown_column3.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180724_1948005189.txt
    [junit] Begin query: unknown_column4.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column4.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column4.q.out
    [junit] Done query: unknown_column4.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180724_1836088980.txt
    [junit] Begin query: unknown_column5.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column5.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column5.q.out
    [junit] Done query: unknown_column5.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180724_1261990314.txt
    [junit] Begin query: unknown_column6.q
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
    [junit] OK
    [junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table srcbucket
    [junit] OK
    [junit] Loading data to table src
    [junit] OK
    [junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column6.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column6.q.out
    [junit] Done query: unknown_column6.q
    [junit] Hive history file=http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/../build/ql/tmp/hive_job_log_hudson_200902180724_782453616.txt
    [junit] Begin query: unknown_function1.q
    [junit]
RE: You are voted to be a Hive committer
Congrats! I guess this means Hive can now reliably survive a massive earthquake in the SF Bay Area.

-----Original Message-----
From: Dhruba Borthakur [mailto:dhr...@gmail.com]
Sent: Tuesday, February 17, 2009 10:45 PM
To: Johan Oskarsson
Cc: hive-dev@hadoop.apache.org
Subject: You are voted to be a Hive committer

Hi Johan,

The Hadoop PMC has voted to make you a committer for the Hive subproject. Please complete and sign the ICLA at http://www.apache.org/licenses/icla.txt and fax it to the number specified in the form. Once the form is processed, you will be granted an Apache account.

thanks,
dhruba
Re: You are voted to be a Hive committer
Congrats Johan!

On Wed, Feb 18, 2009 at 10:55 AM, Joydeep Sen Sarma <jssa...@facebook.com> wrote:

Congrats! I guess this means Hive can now reliably survive a massive earthquake in the SF Bay Area.

-----Original Message-----
From: Dhruba Borthakur [mailto:dhr...@gmail.com]
Sent: Tuesday, February 17, 2009 10:45 PM
To: Johan Oskarsson
Cc: hive-dev@hadoop.apache.org
Subject: You are voted to be a Hive committer

Hi Johan,

The Hadoop PMC has voted to make you a committer for the Hive subproject. Please complete and sign the ICLA at http://www.apache.org/licenses/icla.txt and fax it to the number specified in the form. Once the form is processed, you will be granted an Apache account.

thanks,
dhruba
[jira] Commented: (HIVE-74) Hive can use CombineFileInputFormat for when the input are many small files
[ https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674742#action_12674742 ]

Joydeep Sen Sarma commented on HIVE-74:
---------------------------------------

Is it possible to do this in a way that Hive continues to compile against 0.17/18/19? I think this is almost a hard requirement.

One possibility is to have a new version of HiveInputSplit that only compiles against 0.20 - and have this conditionally in the code only for 0.20 and onwards. (For example, in HiveInputFormat.java there's a conditional tag (//[exclude_0_19]) that does some conditional code inclusion. I am not sure how this was implemented.) But even this is less than ideal. How will we deploy this with 17 (with combinefilesplit and related patches), unless we are not using the open source version directly?

> Hive can use CombineFileInputFormat for when the input are many small files
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-74
>                 URL: https://issues.apache.org/jira/browse/HIVE-74
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.2.0
>         Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch
>
> There are cases when the input to a Hive job are thousands of small files. In this case, there is a mapper for each file. Most of the overhead for spawning all these mappers can be avoided if Hive used CombineFileInputFormat introduced via HADOOP-4565.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
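The overhead argument in the issue description - one mapper per small file unless files are combined - can be sketched without Hadoop. The class below is a hypothetical illustration of the packing idea behind CombineFileInputFormat, not Hive or Hadoop code; the names and the greedy strategy are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: greedily pack small input files into combined
// splits no larger than a target size, which is the core idea behind
// CombineFileInputFormat. One split then feeds one mapper.
public class CombineSplitSketch {
    // Pack file lengths (bytes) into splits of at most maxSplitBytes each.
    public static List<List<Long>> combine(long[] fileLengths, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long len : fileLengths) {
            if (currentSize + len > maxSplitBytes && !current.isEmpty()) {
                splits.add(current);          // close the full split
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(len);
            currentSize += len;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Six 40 MB files with a 128 MB target yield 2 splits, not 6 mappers.
        long[] files = {40 * mb, 40 * mb, 40 * mb, 40 * mb, 40 * mb, 40 * mb};
        System.out.println(combine(files, 128 * mb).size()); // 2
    }
}
```

Hadoop's real implementation additionally considers block and rack locality when grouping files, which this sketch ignores.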
Re: You are voted to be a Hive committer
Congrats Johan. The subject of the email always fools me. I see an email titled "You are voted to be a Hive committer" and feel like I have won an academy award. Then I open the email to find someone else is getting one. Great sorrow.

JK
Re: You are voted to be a Hive committer
Hi Edward,

You are absolutely right! Sorry for the confusion.

dhruba

On Wed, Feb 18, 2009 at 11:36 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

Congrats Johan. The subject of the email always fools me. I see an email titled "You are voted to be a Hive committer" and feel like I have won an academy award. Then I open the email to find someone else is getting one. Great sorrow.

JK
[jira] Commented: (HIVE-74) Hive can use CombineFileInputFormat for when the input are many small files
[ https://issues.apache.org/jira/browse/HIVE-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674769#action_12674769 ]

Joydeep Sen Sarma commented on HIVE-74:
---------------------------------------

Where are the pools for the CombineFileInputFormat created (one per table)?

> Hive can use CombineFileInputFormat for when the input are many small files
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-74
>                 URL: https://issues.apache.org/jira/browse/HIVE-74
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.2.0
>         Attachments: hiveCombineSplit.patch, hiveCombineSplit.patch
>
> There are cases when the input to a Hive job are thousands of small files. In this case, there is a mapper for each file. Most of the overhead for spawning all these mappers can be avoided if Hive used CombineFileInputFormat introduced via HADOOP-4565.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
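For context on the pool question: a pool in CombineFileInputFormat restricts which files may be combined into the same split, so that inputs from different tables are never mixed. A minimal stand-in for that grouping, using the parent directory as a proxy for "one pool per table" (paths are hypothetical; no Hadoop dependency):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only: group input paths into pools keyed by their
// parent directory, so files from different table directories would be
// combined separately. This is not Hadoop's actual pool implementation.
public class PoolSketch {
    public static Map<String, List<String>> poolByTableDir(List<String> paths) {
        Map<String, List<String>> pools = new LinkedHashMap<>();
        for (String p : paths) {
            String dir = p.substring(0, p.lastIndexOf('/'));
            pools.computeIfAbsent(dir, k -> new ArrayList<>()).add(p);
        }
        return pools;
    }

    public static void main(String[] args) {
        List<String> paths = List.of(
            "/warehouse/src/part-0", "/warehouse/src/part-1",
            "/warehouse/srcbucket/part-0");
        // One pool per table directory: 2 pools here.
        System.out.println(poolByTableDir(paths).size()); // 2
    }
}
```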
[jira] Commented: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
[ https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674776#action_12674776 ]

Joydeep Sen Sarma commented on HIVE-131:
----------------------------------------

Please commit this to 0.2 also, since it's a pretty severe bug.

> insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-131
>                 URL: https://issues.apache.org/jira/browse/HIVE-131
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>            Priority: Critical
>         Attachments: HIVE-131.patch.1, hive-131.patch.2
>
> _tmp files are getting left behind on insert overwrite directory:
> -rw-r--r--   3 jssarma supergroup  13285 2008-12-07 01:47 /user/jssarma/ctst1/40422_m_000195_0.deflate
> -rw-r--r--   3 jssarma supergroup   3055 2008-12-07 01:46 /user/jssarma/ctst1/40422_m_000196_0.deflate
> -rw-r--r--   3 jssarma supergroup      0 2008-12-07 01:53 /user/jssarma/ctst1/_tmp.40422_m_33_0
> -rw-r--r--   3 jssarma supergroup      0 2008-12-07 01:53 /user/jssarma/ctst1/_tmp.40422_m_37_1
> this happened with speculative execution. the code looks good (in fact in this case many speculative tasks were launched - and only a couple caused problems). Almost seems like these files did not appear in the namespace until after the map-reduce job finished and the movetask did a listing of the output dir ..

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
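The fix being requested can be pictured as a filter over the output directory listing: files written by failed or speculative task attempts carry a `_tmp.` prefix and must not survive the commit. A hypothetical sketch of that idea, not the actual HIVE-131 patch:

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of the cleanup idea in HIVE-131 (illustrative, not the patch):
// when publishing an output directory, drop any file whose name carries
// the "_tmp." prefix used by in-flight task attempts.
public class TmpFileFilter {
    public static List<String> committedOnly(List<String> names) {
        return names.stream()
                .filter(n -> !n.startsWith("_tmp."))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // File names taken from the listing in the issue description.
        List<String> listing = List.of(
            "40422_m_000195_0.deflate",
            "_tmp.40422_m_33_0",
            "_tmp.40422_m_37_1");
        System.out.println(committedOnly(listing)); // only the committed output remains
    }
}
```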
Re: You are voted to be a Hive committer
Thanks guys, I'll keep the project alive even if California slides into the Pacific :)

/Johan

Jeff Hammerbacher wrote:

Congrats Johan!

On Wed, Feb 18, 2009 at 10:55 AM, Joydeep Sen Sarma <jssa...@facebook.com> wrote:

Congrats! I guess this means Hive can now reliably survive a massive earthquake in the SF Bay Area.

-----Original Message-----
From: Dhruba Borthakur [mailto:dhr...@gmail.com]
Sent: Tuesday, February 17, 2009 10:45 PM
To: Johan Oskarsson
Cc: hive-dev@hadoop.apache.org
Subject: You are voted to be a Hive committer

Hi Johan,

The Hadoop PMC has voted to make you a committer for the Hive subproject. Please complete and sign the ICLA at http://www.apache.org/licenses/icla.txt and fax it to the number specified in the form. Once the form is processed, you will be granted an Apache account.

thanks,
dhruba
Re: Need help on Hive.g and parser!
Thank you. I went through ANTLR. Just curious - was there any comparison done between JavaCC and ANTLR? How is the quality of code generated by ANTLR compared to JavaCC? This could be an issue if in future we would like to embed XML or JavaScript inside Hive QL (not very important at this point). Advanced SQL syntax embeds XML and Java scripts.

Thanks,
Shyam

--- On Tue, 2/17/09, Zheng Shao <zsh...@gmail.com> wrote:

From: Zheng Shao <zsh...@gmail.com>
Subject: Re: Need help on Hive.g and parser!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Tuesday, February 17, 2009, 10:01 PM

We are using ANTLR. Basically, the rule checks the timestamp of HiveParser.java. If it's newer than Hive.g, then we don't need to regenerate HiveParser.java from Hive.g again.

Zheng

On Tue, Feb 17, 2009 at 12:15 PM, Shyam Sarkar <shyam_sar...@yahoo.com> wrote:

Hello,

Someone please explain the following build.xml spec for grammar build (required and not required):

===========================================

<uptodate property="grammarBuild.notRequired">
  <srcfiles dir="${src.dir}/org/apache/hadoop/hive/ql/parse" includes="**/*.g"/>
  <mapper type="merge" to="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse/HiveParser.java"/>
</uptodate>

<target name="build-grammar" unless="grammarBuild.notRequired">
  <echo>Building Grammar ${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g</echo>
  <java classname="org.antlr.Tool" classpathref="classpath" fork="true">
    <arg value="-fo"/>
    <arg value="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse"/>
    <arg value="${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g"/>
  </java>
</target>

===========================================

Also, can someone tell me which parser generator is used? I used JavaCC in the past.

Thanks,
shyam_sar...@yahoo.com

--
Yours,
Zheng
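Zheng's explanation of the build rule boils down to a timestamp comparison, which is exactly what Ant's uptodate mechanism performs. A tiny sketch of that decision, with placeholder timestamps instead of real file stats:

```java
// Sketch of what the <uptodate> check in build.xml decides: regenerate
// HiveParser.java from Hive.g only when the grammar file's modification
// time is newer than the generated parser's. Timestamps here are
// placeholder millisecond values, not real file stats.
public class GrammarUpToDate {
    public static boolean needsRebuild(long grammarMtime, long generatedMtime) {
        return grammarMtime > generatedMtime;
    }

    public static void main(String[] args) {
        // Grammar edited after the last generation -> rebuild.
        System.out.println(needsRebuild(2000L, 1000L)); // true
        // Generated parser is newer -> skip the ANTLR run.
        System.out.println(needsRebuild(1000L, 2000L)); // false
    }
}
```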
[jira] Created: (HIVE-294) Support MAP(a.*), REDUCE(a.*) and TRANSFORM(a.*)
Support MAP(a.*), REDUCE(a.*) and TRANSFORM(a.*)
------------------------------------------------

                 Key: HIVE-294
                 URL: https://issues.apache.org/jira/browse/HIVE-294
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.2.0, 0.3.0
            Reporter: Zheng Shao

The Hive language does not accept MAP(a.*), REDUCE(a.*) and TRANSFORM(a.*) now. We should support it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-131) insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
[ https://issues.apache.org/jira/browse/HIVE-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-131:
----------------------------
       Resolution: Fixed
    Fix Version/s: 0.3.0
                   0.2.0
     Release Note: HIVE-131. Remove uncommitted files from failed tasks. (Joydeep Sen Sarma via zshao)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

trunk: Committed revision 745709.
branch-0.2: Committed revision 745710.

> insert overwrite directory leaves behind uncommitted/tmp files from failed tasks
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-131
>                 URL: https://issues.apache.org/jira/browse/HIVE-131
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>            Priority: Critical
>             Fix For: 0.2.0, 0.3.0
>         Attachments: HIVE-131.patch.1, hive-131.patch.2
>
> _tmp files are getting left behind on insert overwrite directory:
> -rw-r--r--   3 jssarma supergroup  13285 2008-12-07 01:47 /user/jssarma/ctst1/40422_m_000195_0.deflate
> -rw-r--r--   3 jssarma supergroup   3055 2008-12-07 01:46 /user/jssarma/ctst1/40422_m_000196_0.deflate
> -rw-r--r--   3 jssarma supergroup      0 2008-12-07 01:53 /user/jssarma/ctst1/_tmp.40422_m_33_0
> -rw-r--r--   3 jssarma supergroup      0 2008-12-07 01:53 /user/jssarma/ctst1/_tmp.40422_m_37_1
> this happened with speculative execution. the code looks good (in fact in this case many speculative tasks were launched - and only a couple caused problems). Almost seems like these files did not appear in the namespace until after the map-reduce job finished and the movetask did a listing of the output dir ..

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-276) input3_limit.q fails under 0.17
[ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-276:
----------------------------
    Attachment: HIVE-276.2.patch

Incorporated Ashish's comments.

> input3_limit.q fails under 0.17
> -------------------------------
>
>                 Key: HIVE-276
>                 URL: https://issues.apache.org/jira/browse/HIVE-276
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-276.1.patch, HIVE-276.2.patch
>
> The plan ql/src/test/results/clientpositive/input3_limit.q.out shows that there are 2 map-reduce jobs. The first one is distributed and sorted as specified by the query; the reducer side has LIMIT 20. The second one (a single-reducer job imposed by LIMIT 20) does not have the same sort order, so the final result is non-deterministic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
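The non-determinism described in the issue can be reproduced in miniature: taking the first n rows of data whose order is not fixed varies from run to run, while re-imposing a sort before the limit pins the answer down. A hypothetical sketch, not Hive's plan code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of why the second map-reduce job in HIVE-276 was flaky: a LIMIT
// over rows whose order is not guaranteed can return different rows each
// run. Sorting before applying the limit makes the result deterministic.
public class DeterministicLimit {
    public static List<Integer> sortedLimit(List<Integer> rows, int n) {
        List<Integer> copy = new ArrayList<>(rows);
        Collections.sort(copy);                       // re-impose a total order
        return copy.subList(0, Math.min(n, copy.size()));
    }

    public static void main(String[] args) {
        // Two "runs" deliver the same rows in different arrival orders...
        List<Integer> run1 = List.of(3, 1, 2);
        List<Integer> run2 = List.of(2, 3, 1);
        // ...but sorting before the limit makes both answers identical.
        System.out.println(sortedLimit(run1, 2).equals(sortedLimit(run2, 2))); // true
    }
}
```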
[jira] Commented: (HIVE-276) input3_limit.q fails under 0.17
[ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674870#action_12674870 ]

Raghotham Murthy commented on HIVE-276:
---------------------------------------

+1 looks good.

> input3_limit.q fails under 0.17
> -------------------------------
>
>                 Key: HIVE-276
>                 URL: https://issues.apache.org/jira/browse/HIVE-276
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-276.1.patch, HIVE-276.2.patch
>
> The plan ql/src/test/results/clientpositive/input3_limit.q.out shows that there are 2 map-reduce jobs. The first one is distributed and sorted as specified by the query; the reducer side has LIMIT 20. The second one (a single-reducer job imposed by LIMIT 20) does not have the same sort order, so the final result is non-deterministic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
*UNIT TEST FAILURE for apache HIVE* Hadoop.Version=0.17.1 based on SVN Rev# 745710.54
    [junit] Test org.apache.hadoop.hive.cli.TestCliDriver FAILED

BUILD FAILED
[jira] Updated: (HIVE-279) Implement predicate push down for hive queries
[ https://issues.apache.org/jira/browse/HIVE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Chakka updated HIVE-279:
-------------------------------
    Attachment: hive-279.patch

This is a drop for initial review, since I suspect there will be a lot of comments :). It should work for all cases except for multi-insert queries. I have not enabled this by default but added a new config param called hive.optimize.ppd to enable this feature. I have not modified existing testcases but added a couple of new testcases; will add more while uploading the final patch.

> Implement predicate push down for hive queries
> ----------------------------------------------
>
>                 Key: HIVE-279
>                 URL: https://issues.apache.org/jira/browse/HIVE-279
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.2.0
>            Reporter: Prasad Chakka
>            Assignee: Prasad Chakka
>         Attachments: hive-279.patch
>
> Push predicates that are expressed in outer queries into inner queries where possible so that rows get filtered out sooner.
> eg. select a.*, b.* from a join b on (a.uid = b.uid) where a.age = 20 and a.gender = 'm'
> The current compiler generates the filter predicate in the reducer after the join, so all the rows have to be passed from mapper to reducer. By pushing the filter predicate to the mapper, query performance should improve.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
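The performance argument in the description - apply the filter before the join so fewer rows are shuffled from mapper to reducer - can be sketched as follows. The row layout and names are invented for illustration; this is not the hive-279.patch logic:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the predicate push-down idea in HIVE-279 (illustrative only):
// rows of table `a` are modeled as {uid, age}; pushing the filter on age
// below the join means only matching rows are shuffled to the join stage.
public class PushDownSketch {
    public static List<int[]> pushFilter(List<int[]> aRows, int age) {
        List<int[]> kept = new ArrayList<>();
        for (int[] row : aRows) {
            if (row[1] == age) {
                kept.add(row);   // only these rows travel to the reducer/join
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<int[]> a = List.of(
            new int[]{1, 20}, new int[]{2, 15}, new int[]{3, 20});
        // 2 of 3 rows survive the pushed-down "a.age = 20" predicate.
        System.out.println(pushFilter(a, 20).size()); // 2
    }
}
```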
[jira] Updated: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by
[ https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-223:
----------------------------
    Status: Patch Available  (was: Open)

Fixed a small bug.

> when using map-side aggregates - perform single map-reduce group-by
> --------------------------------------------------------------------
>
>                 Key: HIVE-223
>                 URL: https://issues.apache.org/jira/browse/HIVE-223
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Namit Jain
>         Attachments: 223.2.txt, 223.patch1.txt
>
> today even when we do map-side aggregates, we do multiple map-reduce jobs. however, the reason for doing multiple map-reduce group-bys (for single group-bys) was the fear of skews. When we are doing map-side aggregates, skews should not exist for the most part. There can be two reasons for skews:
> - a large number of entries for a single grouping set - map-side aggregates should take care of this
> - badness in the hash function that sends too much stuff to one reducer - we should be able to take care of this by having good hash functions (and prime-number reducer counts)
> So I think we should be able to do a single-stage map-reduce when doing map-side aggregates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
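The skew reasoning in the description leans on hash partitioning spreading grouping keys roughly evenly over reducers once map-side aggregation has collapsed duplicates. A small sketch of that claim, using hash-modulo partitioning and a prime reducer count as the description suggests (counts are illustrative, not from Hive):

```java
// Sketch of the HIVE-223 skew argument: with one record per grouping key
// after map-side aggregation, hash-modulo partitioning over a prime
// number of reducers spreads keys almost evenly.
public class ReducerSpread {
    public static int[] bucketCounts(int numKeys, int numReducers) {
        int[] counts = new int[numReducers];
        for (int key = 0; key < numKeys; key++) {
            // Same partitioning rule as hash(key) mod numReducers.
            counts[(Integer.hashCode(key) & Integer.MAX_VALUE) % numReducers]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = bucketCounts(10007, 7); // 7 reducers (prime count)
        int max = 0, min = Integer.MAX_VALUE;
        for (int v : counts) {
            max = Math.max(max, v);
            min = Math.min(min, v);
        }
        // Heaviest and lightest reducers differ by at most one key.
        System.out.println(max - min); // 1
    }
}
```

Real grouping keys are strings rather than consecutive integers, so the spread depends on the hash function's quality, which is exactly the "badness in the hash function" risk the description names.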
[jira] Updated: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by
[ https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-223:
----------------------------
    Status: Patch Available  (was: Open)

> when using map-side aggregates - perform single map-reduce group-by
> --------------------------------------------------------------------
>
>                 Key: HIVE-223
>                 URL: https://issues.apache.org/jira/browse/HIVE-223
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Namit Jain
>         Attachments: 223.2.txt, 223.3.txt, 223.patch1.txt
>
> today even when we do map-side aggregates, we do multiple map-reduce jobs. however, the reason for doing multiple map-reduce group-bys (for single group-bys) was the fear of skews. When we are doing map-side aggregates, skews should not exist for the most part. There can be two reasons for skews:
> - a large number of entries for a single grouping set - map-side aggregates should take care of this
> - badness in the hash function that sends too much stuff to one reducer - we should be able to take care of this by having good hash functions (and prime-number reducer counts)
> So I think we should be able to do a single-stage map-reduce when doing map-side aggregates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by
[ https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-223:
----------------------------
    Status: Open  (was: Patch Available)

> when using map-side aggregates - perform single map-reduce group-by
> --------------------------------------------------------------------
>
>                 Key: HIVE-223
>                 URL: https://issues.apache.org/jira/browse/HIVE-223
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Namit Jain
>         Attachments: 223.2.txt, 223.3.txt, 223.patch1.txt
>
> today even when we do map-side aggregates, we do multiple map-reduce jobs. however, the reason for doing multiple map-reduce group-bys (for single group-bys) was the fear of skews. When we are doing map-side aggregates, skews should not exist for the most part. There can be two reasons for skews:
> - a large number of entries for a single grouping set - map-side aggregates should take care of this
> - badness in the hash function that sends too much stuff to one reducer - we should be able to take care of this by having good hash functions (and prime-number reducer counts)
> So I think we should be able to do a single-stage map-reduce when doing map-side aggregates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by
[ https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-223:
----------------------------
    Attachment: 223.3.txt
[jira] Commented: (HIVE-291) [Hive] map-side aggregation should be automatically disabled at run-time if it is not turning out to be useful
[ https://issues.apache.org/jira/browse/HIVE-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674893#action_12674893 ]

Namit Jain commented on HIVE-291:
---------------------------------
tested one big job for correctness

> [Hive] map-side aggregation should be automatically disabled at run-time if it is not turning out to be useful
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-291
>                 URL: https://issues.apache.org/jira/browse/HIVE-291
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: 291.1.txt
>
> Map-side aggregation should be automatically disabled at run-time if it is not turning out to be useful. If map-side aggregation is not reducing the number of output rows, it is a drain on the mapper, since it consumes memory and performs unnecessary hash lookups.
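The run-time check described in HIVE-291 can be sketched as follows. This is a hypothetical Python illustration, not Hive's GroupByOperator; the `check_interval` and `min_reduction` knobs are assumptions standing in for whatever thresholds the patch uses:

```python
class AdaptiveMapAggregator:
    """Turn off map-side hashing when it is not reducing output rows."""

    def __init__(self, check_interval=10000, min_reduction=0.5):
        self.check_interval = check_interval  # rows between usefulness checks
        self.min_reduction = min_reduction    # required distinct/seen ratio cap
        self.partials = {}
        self.rows_seen = 0
        self.enabled = True

    def process(self, key, value):
        """Returns rows to emit now; [] means the row was absorbed."""
        if not self.enabled:
            return [(key, value)]  # pass through as plain map output
        self.rows_seen += 1
        self.partials[key] = self.partials.get(key, 0) + value
        if self.rows_seen % self.check_interval == 0:
            # If the hash table holds nearly as many entries as rows seen,
            # aggregation is not compressing anything: flush and disable it,
            # freeing the memory and skipping further hash lookups.
            if len(self.partials) > self.min_reduction * self.rows_seen:
                flushed = list(self.partials.items())
                self.partials.clear()
                self.enabled = False
                return flushed
        return []

    def close(self):
        """Flush any remaining partials at end of the map task."""
        return list(self.partials.items())
```

When keys are mostly distinct, the table is flushed once and all later rows stream straight through, which is exactly the "drain on the mapper" the issue wants to avoid.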
[jira] Commented: (HIVE-223) when using map-side aggregates - perform single map-reduce group-by
[ https://issues.apache.org/jira/browse/HIVE-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674892#action_12674892 ]

Namit Jain commented on HIVE-223:
---------------------------------
tested one big job for correctness
JIRA_223.3.txt_UNIT_TEST_SUCCEEDED
SUCCESS: BUILD AND UNIT TEST using PATCH 223.3.txt PASSED!!