Build failed in Hudson: Hive-trunk-h0.17 #364
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/364/

--
Started by timer
Building remotely on minerva.apache.org (Ubuntu)
Updating http://svn.apache.org/repos/asf/hadoop/hive/trunk
U  eclipse-templates/.settings/org.eclipse.jdt.core.prefs
U  eclipse-templates/.settings/org.eclipse.jdt.ui.prefs
U  conf/hive-default.xml
U  CHANGES.txt
U  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
U  build.xml
U  checkstyle/checkstyle.xml
U  contrib/src/java/org/apache/hadoop/hive/contrib/mr/GenericMR.java
U  .checkstyle
U  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
U  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
At revision 910507
no revision recorded for http://svn.apache.org/repos/asf/hadoop/hive/trunk in the previous build
[hive] $ /home/hudson/tools/ant/latest/bin/ant -Dhadoop.version=0.17.2.1 clean package javadoc test
Buildfile: build.xml

clean:

clean:
     [echo] Cleaning: anttasks
   [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks

clean:
     [echo] Cleaning: shims
   [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims

clean:
     [echo] Cleaning: common

clean:
     [echo] Cleaning: serde

clean:
     [echo] Cleaning: metastore

clean:
     [echo] Cleaning: ql

clean:
     [echo] Cleaning: cli

clean:
     [echo] Cleaning: contrib

clean:

clean:

clean:
     [echo] Cleaning: hwi

clean:
     [exec] rm -rf http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/odbc http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/service/objs http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/service/fb303/objs http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/metastore/objs

clean-online:
   [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build

clean-offline:

jar:

create-dirs:
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims/classes
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/jexl/classes
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/hadoopcore
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims/test
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims/test/src
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/shims/test/classes

compile-ant-tasks:

create-dirs:
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/classes
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/test
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/test/src
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/test/classes

init:

compile:
     [echo] Compiling: anttasks
    [javac] Compiling 2 source files to http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/classes
    [javac] Note: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ant/src/org/apache/hadoop/hive/ant/QTestGenTask.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.

deploy-ant-tasks:

create-dirs:

init:

compile:
     [echo] Compiling: anttasks

jar:
     [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/classes/org/apache/hadoop/hive/ant
      [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/anttasks/hive-anttasks-0.6.0.jar

init:

compile:

ivy-init-dirs:
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ivy
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ivy/lib
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ivy/report
    [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ivy/maven

ivy-download:
      [get] Getting:
Build failed in Hudson: Hive-trunk-h0.18 #367
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/367/changes

Changes:

[zshao] HIVE-1158. Introducing a new parameter for Map-side join bucket size. (Ning Zhang via zshao)

[zshao] HIVE-1147. Update Eclipse project configuration to match Checkstyle (Carl Steinbach via zshao)

--
[...truncated 2972 lines...]
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPEqualOrGreaterThan.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitNot.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNot.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMinute.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitXor.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPEqual.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFConcat.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSecond.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPLongDivide.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRegExpExtract.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFromUnixTime.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateSub.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNotEqual.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIndex.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNull.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/UDTFCollector.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStruct.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCount.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNotNull.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVarianceSample.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBridge.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBridge.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLocate.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCase.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMap.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArray.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/Collector.java
A
Build failed in Hudson: Hive-trunk-h0.19 #367
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/367/changes

Changes:

[zshao] HIVE-1158. Introducing a new parameter for Map-side join bucket size. (Ning Zhang via zshao)

[zshao] HIVE-1147. Update Eclipse project configuration to match Checkstyle (Carl Steinbach via zshao)

--
[...truncated 2972 lines...]
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPEqualOrGreaterThan.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitNot.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNot.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMinute.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPBitXor.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPEqual.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFConcat.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSecond.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPLongDivide.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRegExpExtract.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFromUnixTime.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateSub.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNotEqual.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java
AU ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIndex.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNull.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/UDTFCollector.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStruct.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCount.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNotNull.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVarianceSample.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBridge.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBridge.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLocate.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCase.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMap.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArray.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java
A  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/Collector.java
A
[jira] Commented: (HIVE-1117) Make QueryPlan serializable
[ https://issues.apache.org/jira/browse/HIVE-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834360#action_12834360 ]

Namit Jain commented on HIVE-1117:
----------------------------------

TestParse failed - can you update the outputs for the TestParse results?

> Make QueryPlan serializable
> ---------------------------
> Key: HIVE-1117
> URL: https://issues.apache.org/jira/browse/HIVE-1117
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Zheng Shao
> Assignee: Zheng Shao
> Fix For: 0.6.0
> Attachments: HIVE-1117.1.code.patch, HIVE-1117.1.test.patch
>
> We need to make QueryPlan serializable so that we can resume the query some time later.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
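The issue above asks for the query plan to survive a round-trip through serialization so execution can resume later. Hive's QueryPlan is Java and uses its own mechanisms; the Python below is only a toy illustration of the resume idea, and the `QueryPlan` class here is a made-up stand-in, not Hive's.

```python
import pickle
from dataclasses import dataclass

# Hypothetical stand-in for a query plan: a list of task names plus a cursor
# recording how far execution got before being interrupted.
@dataclass
class QueryPlan:
    tasks: list
    next_task: int = 0

def save_plan(plan, path):
    # Persist the whole plan, including progress, so a later process can resume.
    with open(path, "wb") as f:
        pickle.dump(plan, f)

def resume_plan(path):
    # Reload the plan and return only the tasks that still need to run,
    # rather than re-running the ones that already finished.
    with open(path, "rb") as f:
        plan = pickle.load(f)
    return plan.tasks[plan.next_task:]
```

The key design point is that progress state travels with the plan, so the resuming process needs nothing beyond the serialized blob.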
[jira] Commented: (HIVE-1117) Make QueryPlan serializable
[ https://issues.apache.org/jira/browse/HIVE-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834372#action_12834372 ]

Namit Jain commented on HIVE-1117:
----------------------------------

+1, will commit if the tests pass

> Make QueryPlan serializable
> ---------------------------
> Key: HIVE-1117
> URL: https://issues.apache.org/jira/browse/HIVE-1117
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Zheng Shao
> Assignee: Zheng Shao
> Fix For: 0.6.0
> Attachments: HIVE-1117.1.code.patch, HIVE-1117.1.test.patch
>
> We need to make QueryPlan serializable so that we can resume the query some time later.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-917) Bucketed Map Join
[ https://issues.apache.org/jira/browse/HIVE-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-917:
------------------------------

Attachment: hive-917-2010-2-16.patch

A new patch: 1) added explain extended; 2) broke the test into 4 tests.

> Bucketed Map Join
> -----------------
> Key: HIVE-917
> URL: https://issues.apache.org/jira/browse/HIVE-917
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Zheng Shao
> Assignee: He Yongqiang
> Attachments: hive-917-2010-2-15.patch, hive-917-2010-2-16.patch, hive-917-2010-2-3.patch, hive-917-2010-2-8.patch
>
> Hive already has support for map-join. Map-join treats the big table as job input, and in each mapper it loads all data from a small table. In case the big table is already bucketed on the join key, we don't have to load the whole small table in each of the mappers. This will greatly alleviate the memory pressure and make map-join work with medium-sized tables. There are 4 steps we can improve:
>
> S0. This is what the user can already do now: create a new bucketed table and insert all data from the small table into it; submit BUCKETNUM jobs, each doing a map-side join of bigtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS) with smallbucketedtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS).
>
> S1. Change the code so that when map-join is loading the small table, we automatically drop the rows with keys that are NOT in the same bucket as the big table. This should alleviate the problem on memory, but we might still have thousands of mappers reading the whole of the small table.
>
> S2. Let's say the user already bucketed the small table on the join key into exactly the same number of buckets (or a factor of the buckets of the big table); then map-join can choose to load only the buckets that are useful.
>
> S3. Add a new hint (e.g. /*+ MAPBUCKETJOIN(a) */), so that Hive automatically does S2, without the need of asking the user to create a temporary bucketed table for the small table.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
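Step S1 in the HIVE-917 description above (drop small-table rows that fall outside the mapper's bucket while loading) can be sketched in a few lines. This is an illustration only: the hash function and row shape are invented, not Hive's actual bucketing code.

```python
def bucket_of(key, num_buckets):
    # Rows are assigned to buckets by hashing the join key. Hive uses its own
    # hash of the column value; any deterministic function works for the sketch.
    return hash(key) % num_buckets

def load_small_table(rows, mapper_bucket, num_buckets):
    # Keep only rows whose key lands in the bucket this mapper is joining;
    # rows in other buckets could never match and would only waste memory.
    return [r for r in rows if bucket_of(r[0], num_buckets) == mapper_bucket]
```

Because the big table is bucketed by the same key, every join match for this mapper is guaranteed to survive the filter, while memory drops roughly by a factor of the bucket count.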
[jira] Created: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
Job counter error if hive.merge.mapfiles equals true
----------------------------------------------------

Key: HIVE-1174
URL: https://issues.apache.org/jira/browse/HIVE-1174
Project: Hadoop Hive
Issue Type: Bug
Reporter: He Yongqiang

If hive.merge.mapfiles is set to true, the job counter will go to 3.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1168) Fix Hive build on Hudson
[ https://issues.apache.org/jira/browse/HIVE-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834423#action_12834423 ]

John Sichi commented on HIVE-1168:
----------------------------------

I am following up on some leads on who might have access to the Hudson build environment and will update status here once I get an answer.

> Fix Hive build on Hudson
> ------------------------
> Key: HIVE-1168
> URL: https://issues.apache.org/jira/browse/HIVE-1168
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Build Infrastructure
> Reporter: Carl Steinbach
> Assignee: John Sichi
> Priority: Critical
>
> {quote}
> We need to delete the .ant directory containing the old ivy version in order to fix it (and if we're using the same environment for both trunk and branches, either segregate them or script an rm to clean in between).
> {quote}
> It's worth noting that Ant may have picked up the old version of Ivy from somewhere else. In order, Ant's classpath contains:
> # Ant's startup JAR file, ant-launcher.jar
> # Everything in the directory containing the version of ant-launcher.jar that's running, i.e. everything in ANT_HOME/lib
> # All JAR files in ${user.home}/.ant/lib
> # Directories and JAR files supplied via the -lib command line option.
> # Everything in the CLASSPATH variable unless the -noclasspath option is used.
> (2) implies that users on shared machines may have to install their own version of Ant in order to get around these problems, assuming that the administrator has installed the ivy.jar in $ANT_HOME/lib.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
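The lookup order listed above is why a stale ivy.jar can shadow the intended one: the first location that provides the artifact wins. A toy model of that first-match precedence, with entirely made-up directory contents:

```python
def resolve(artifact, search_path):
    # Walk the lookup locations in order and return the first hit; entries
    # earlier in the search path shadow anything that comes later.
    for location, jars in search_path:
        if artifact in jars:
            return location, jars[artifact]
    return None

# Illustrative contents only: ANT_HOME/lib has no Ivy installed, so the stale
# copy lingering in ~/.ant/lib is what every build on the shared box picks up —
# exactly the situation the quoted fix (deleting the .ant directory) addresses.
search_path = [
    ("ANT_HOME/lib", {"ant-launcher": "1.7.1"}),
    ("~/.ant/lib", {"ivy": "old"}),
    ("-lib dirs", {}),
]
```

Installing the current Ivy into a location earlier in the order (or removing the stale jar) changes which copy `resolve` returns, which is the whole fix.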
[jira] Commented: (HIVE-1168) Fix Hive build on Hudson
[ https://issues.apache.org/jira/browse/HIVE-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834433#action_12834433 ]

John Sichi commented on HIVE-1168:
----------------------------------

Mailing list thread quoted by Carl is here:

http://mail-archives.apache.org/mod_mbox/hadoop-hive-dev/201002.mbox/%3c7b4f4fe5-d478-4d67-a30e-d3e88e744...@facebook.com%3e

> Fix Hive build on Hudson
> ------------------------
> Key: HIVE-1168
> URL: https://issues.apache.org/jira/browse/HIVE-1168
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Build Infrastructure
> Reporter: Carl Steinbach
> Assignee: John Sichi
> Priority: Critical
>
> {quote}
> We need to delete the .ant directory containing the old ivy version in order to fix it (and if we're using the same environment for both trunk and branches, either segregate them or script an rm to clean in between).
> {quote}
> It's worth noting that Ant may have picked up the old version of Ivy from somewhere else. In order, Ant's classpath contains:
> # Ant's startup JAR file, ant-launcher.jar
> # Everything in the directory containing the version of ant-launcher.jar that's running, i.e. everything in ANT_HOME/lib
> # All JAR files in ${user.home}/.ant/lib
> # Directories and JAR files supplied via the -lib command line option.
> # Everything in the CLASSPATH variable unless the -noclasspath option is used.
> (2) implies that users on shared machines may have to install their own version of Ant in order to get around these problems, assuming that the administrator has installed the ivy.jar in $ANT_HOME/lib.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1173) Partition pruner cancels pruning if non-deterministic function present in filtering expression only in joins is present in query
[ https://issues.apache.org/jira/browse/HIVE-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834438#action_12834438 ]

Zheng Shao commented on HIVE-1173:
----------------------------------

Can you try the condition: part = 'part1' AND part UDF2('part0')
The optimizer might do something different because of the short-circuit calculation of AND.

> Partition pruner cancels pruning if non-deterministic function present in filtering expression only in joins is present in query
> --------------------------------------------------------------------------------------------------------------------------------
> Key: HIVE-1173
> URL: https://issues.apache.org/jira/browse/HIVE-1173
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.4.0, 0.4.1
> Reporter: Vladimir Klimontovich
>
> Brief description:
> case 1) a non-deterministic function is present in the partition condition and joins are present in the query => the partition pruner doesn't filter partitions based on the condition
> case 2) a non-deterministic function is present in the partition condition and joins aren't present in the query => the partition pruner does filter partitions based on the condition
> It's quite illogical when pruning depends on the presence of joins in the query.
>
> Example: Let's consider the following sequence of Hive queries:
>
> 1) Create a non-deterministic function: create temporary function UDF2 as 'UDF2';
> {{
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.hive.ql.udf.UDFType;
>
> @UDFType(deterministic = false)
> public class UDF2 extends UDF {
>   public String evaluate(String val) {
>     return val;
>   }
> }
> }}
>
> 2) Create tables:
> CREATE TABLE Main (a STRING, b INT) PARTITIONED BY (part STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '10' STORED AS TEXTFILE;
> ALTER TABLE Main ADD PARTITION (part='part1') LOCATION '/hive-join-test/part1/';
> ALTER TABLE Main ADD PARTITION (part='part2') LOCATION '/hive-join-test/part2/';
> CREATE TABLE Joined (a STRING, f STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '10' STORED AS TEXTFILE
> LOCATION '/hive-join-test/join/';
>
> 3) Run the first query:
> select m.a, m.b from Main m where part UDF2('part0') AND part = 'part1';
> The pruner works for this query:
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1
>
> 4) Run the second query (with a join):
> select m.a, j.a, m.b from Main m join Joined j on j.a=m.a where part UDF2('part0') AND part = 'part1';
> The pruner doesn't work:
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1,hdfs://localhost:9000/hive-join-test/part2,hdfs://localhost:9000/hive-join-test/join
>
> 5) Also, let's try to run a query with the MAPJOIN hint:
> select /*+MAPJOIN(j)*/ m.a, j.a, m.b from Main m join Joined j on j.a=m.a where part UDF2('part0') AND part = 'part1';
> The result is the same, the pruner doesn't work:
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1,hdfs://localhost:9000/hive-join-test/part2

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
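A compile-time pruner can only decide conjuncts built from deterministic functions; the behavior HIVE-1173 asks for is to prune on those and simply skip the non-deterministic ones, instead of cancelling pruning entirely. The sketch below models that; the predicate representation is invented for the illustration and has nothing to do with Hive's actual expression trees.

```python
def prune(partitions, conjuncts):
    # conjuncts: list of (predicate_fn, is_deterministic) pairs ANDed together.
    # Non-deterministic conjuncts cannot be evaluated at compile time, so they
    # are skipped here rather than disabling pruning for the whole query.
    kept = partitions
    for pred, deterministic in conjuncts:
        if deterministic:
            kept = [p for p in kept if pred(p)]
    return kept
```

Skipping a conjunct is safe because it can only make the kept set larger, never drop a partition the full predicate would have accepted.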
[jira] Commented: (HIVE-1134) bucketing mapjoin where the big table contains more than 1 big table
[ https://issues.apache.org/jira/browse/HIVE-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834446#action_12834446 ]

Namit Jain commented on HIVE-1134:
----------------------------------

Some pending work from https://issues.apache.org/jira/browse/HIVE-917 - you can do that in a separate jira if you want to.

1. Add the mapping in the explain plan so that it can be compared - look at https://issues.apache.org/jira/browse/HIVE-976
2. Add a negative test where the number of buckets in the 2 tables are not exact multiples of each other - I mean, bucketed map join will not be used.
3. Instead of checking at runtime, set the default bucket matcher in the plan and initialize it using reflection.

> bucketing mapjoin where the big table contains more than 1 big table
> --------------------------------------------------------------------
> Key: HIVE-1134
> URL: https://issues.apache.org/jira/browse/HIVE-1134
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: He Yongqiang
> Fix For: 0.6.0

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-917) Bucketed Map Join
[ https://issues.apache.org/jira/browse/HIVE-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834448#action_12834448 ]

Namit Jain commented on HIVE-917:
---------------------------------

Added some more tasks in the follow-up jira after talking to Yongqiang. Will commit this if the tests pass.

> Bucketed Map Join
> -----------------
> Key: HIVE-917
> URL: https://issues.apache.org/jira/browse/HIVE-917
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Zheng Shao
> Assignee: He Yongqiang
> Attachments: hive-917-2010-2-15.patch, hive-917-2010-2-16.patch, hive-917-2010-2-3.patch, hive-917-2010-2-8.patch
>
> Hive already has support for map-join. Map-join treats the big table as job input, and in each mapper it loads all data from a small table. In case the big table is already bucketed on the join key, we don't have to load the whole small table in each of the mappers. This will greatly alleviate the memory pressure and make map-join work with medium-sized tables. There are 4 steps we can improve:
>
> S0. This is what the user can already do now: create a new bucketed table and insert all data from the small table into it; submit BUCKETNUM jobs, each doing a map-side join of bigtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS) with smallbucketedtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS).
>
> S1. Change the code so that when map-join is loading the small table, we automatically drop the rows with keys that are NOT in the same bucket as the big table. This should alleviate the problem on memory, but we might still have thousands of mappers reading the whole of the small table.
>
> S2. Let's say the user already bucketed the small table on the join key into exactly the same number of buckets (or a factor of the buckets of the big table); then map-join can choose to load only the buckets that are useful.
>
> S3. Add a new hint (e.g. /*+ MAPBUCKETJOIN(a) */), so that Hive automatically does S2, without the need of asking the user to create a temporary bucketed table for the small table.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-259) Add PERCENTILE aggregate function
[ https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834474#action_12834474 ]

Zheng Shao commented on HIVE-259:
---------------------------------

> Is there any limitation on what can be used on the state object or can we use any java Object?

We support primitive classes, HashMap (translated into the map type in Hive), ArrayList (the array type in Hive), and any simple struct-like classes (the struct type in Hive). We support arbitrary levels of nesting, but no recursive types.

> Also how is the state serialized between Map and Reduce?

We use SerDe (see SerDe.serialize(...)) to serialize/deserialize the objects, as well as to translate between objects that have the same type (see ObjectInspector and ObjectInspectorConverters).

> Add PERCENTILE aggregate function
> ---------------------------------
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Venky Iyer
> Assignee: Jerome Boulon
> Attachments: HIVE-259.1.patch, HIVE-259.patch
>
> Compute at least the 25th, 50th, and 75th percentiles.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
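The map/reduce split of aggregation state discussed above can be illustrated with a simplified percentile: each mapper builds a partial state, the states are merged, and the percentile is computed once at the end. In Hive the partial state would be serialized by a SerDe between the phases; this sketch keeps it as plain Python objects and uses a nearest-rank percentile, which is one of several possible definitions.

```python
def partial(values):
    # Map side: build a partial aggregation state. For an exact percentile
    # the state must carry all values (or a summary sketch of them).
    return sorted(values)

def merge(state_a, state_b):
    # Reduce side: combine two partial states into one.
    return sorted(state_a + state_b)

def percentile(state, p):
    # Terminate: nearest-rank percentile over the merged, sorted state.
    if not state:
        return None
    idx = max(0, int(round(p / 100.0 * len(state))) - 1)
    return state[idx]
```

Note that `merge` must be associative and order-insensitive, since the framework gives no guarantee about which partial states meet first.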
[jira] Updated: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1136:
-----------------------------

Status: Patch Available  (was: Open)

This passed tests so I'm submitting it as ready.

> add type-checking setters for HiveConf class to match existing getters
> ----------------------------------------------------------------------
> Key: HIVE-1136
> URL: https://issues.apache.org/jira/browse/HIVE-1136
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: 0.6.0
> Reporter: John Sichi
> Assignee: John Sichi
> Fix For: 0.6.0
> Attachments: HIVE-1136.1.patch
>
> This is a followup from HIVE-1129.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834482#action_12834482 ]

Zheng Shao commented on HIVE-1136:
----------------------------------

+1. Will test and commit.

> add type-checking setters for HiveConf class to match existing getters
> ----------------------------------------------------------------------
> Key: HIVE-1136
> URL: https://issues.apache.org/jira/browse/HIVE-1136
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: 0.6.0
> Reporter: John Sichi
> Assignee: John Sichi
> Fix For: 0.6.0
> Attachments: HIVE-1136.1.patch
>
> This is a followup from HIVE-1129.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1174:
-------------------------------

Status: Patch Available  (was: Open)

> Job counter error if hive.merge.mapfiles equals true
> ----------------------------------------------------
> Key: HIVE-1174
> URL: https://issues.apache.org/jira/browse/HIVE-1174
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Attachments: hive-1174.1.patch
>
> If hive.merge.mapfiles is set to true, the job counter will go to 3.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1174:
-------------------------------

Attachment: hive-1174.1.patch

> Job counter error if hive.merge.mapfiles equals true
> ----------------------------------------------------
> Key: HIVE-1174
> URL: https://issues.apache.org/jira/browse/HIVE-1174
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: He Yongqiang
> Assignee: He Yongqiang
> Attachments: hive-1174.1.patch
>
> If hive.merge.mapfiles is set to true, the job counter will go to 3.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-984) Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
[ https://issues.apache.org/jira/browse/HIVE-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834519#action_12834519 ] John Sichi commented on HIVE-984: - I spoke with Zheng about this and here's what we came up with. Carl, let me know if this works for you. * If at all possible, we want to keep building all supported shims as part of ant package to make sure that when a change breaks one, the developer finds out early (before even submitting a bad patch) * The long term plan does involve deprecating and eventually dropping support for older Hadoop versions. The fact that Facebook still has some dependencies on 0.17 probably explains why that is currently the oldest version, but the standard voting procedure can be used at the project level for initiating a deprecation process going forward. * Regardless of how many Hadoop versions we support, the current Hadoop+ivy situation is definitely broken, and we need to fix it ASAP since it can be a major impediment to new or existing contributors. * Before doing anything else, I'm going to see if a more reliable source than archive.apache.org would address the problem. I'll test this with my home network tomorrow, which usually fails with archive.apache.org. * If a more reliable source would help, then we'll see if we can get mirror.facebook.net to provide all supported Hadoop versions (currently only apache.archive.org has the old ones), and if that's the case, then we'll check in a change to build.properties to make it the default source. * If either of the above is not the case, then we can do what you proposed in HIVE-1171 (check the Hadoop dependencies into svn instead). 
> Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-984
>                 URL: https://issues.apache.org/jira/browse/HIVE-984
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Build Infrastructure
>            Reporter: Carl Steinbach
>            Assignee: Carl Steinbach
>         Attachments: HIVE-984.2.patch, HIVE-984.patch
>
> Folks keep running into this problem when building Hive from source:
> {noformat}
> [ivy:retrieve] :: problems summary ::
> [ivy:retrieve] WARNINGS
> [ivy:retrieve] [FAILED ] hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5: expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72 (138662ms)
> [ivy:retrieve] [FAILED ] hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5: expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72 (138662ms)
> [ivy:retrieve] hadoop-resolver: tried
> [ivy:retrieve]   http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
> [ivy:retrieve] ::
> [ivy:retrieve] :: FAILED DOWNLOADS ::
> [ivy:retrieve] :: ^ see resolution messages for details ^ ::
> [ivy:retrieve] ::
> [ivy:retrieve] :: hadoop#core;0.20.1!hadoop.tar.gz(source)
> [ivy:retrieve] ::
> [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
> {noformat}
> The problem appears to be either with a) the Hive build scripts, b) ivy, or c) archive.apache.org
> Besides fixing the actual bug, one other option worth considering is to add the Hadoop jars to the Hive source repository.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1136:
-----------------------------

    Status: Open  (was: Patch Available)

> add type-checking setters for HiveConf class to match existing getters
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1136
>                 URL: https://issues.apache.org/jira/browse/HIVE-1136
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Configuration
>    Affects Versions: 0.6.0
>            Reporter: John Sichi
>            Assignee: John Sichi
>             Fix For: 0.6.0
>         Attachments: HIVE-1136.1.patch
>
> This is a followup from HIVE-1129.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834526#action_12834526 ]

Zheng Shao commented on HIVE-1136:
----------------------------------

It seems that the following function is not in Hadoop 0.17. What about converting it to a string first? (And put in a comment saying we did this for compatibility with Hadoop 0.17.)

{code}
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:303: cannot find symbol
    [javac] symbol  : method setFloat(java.lang.String,float)
    [javac] location: class org.apache.hadoop.conf.Configuration
    [javac]     conf.setFloat(var.varname, val);
    [javac]         ^
{code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
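Zheng's suggested workaround can be sketched roughly as follows (a minimal, self-contained model; the class ConfFloatCompat, its method names, and the example key are illustrative stand-ins, not code from the patch). Since Configuration.setFloat(String, float) does not exist in Hadoop 0.17, the float is converted to a string and stored through the generic set() call, which every supported version provides:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for org.apache.hadoop.conf.Configuration's generic
// string-based API, which exists in all supported Hadoop versions.
public class ConfFloatCompat {
    private final Map<String, String> vars = new HashMap<>();

    public void set(String name, String value) { vars.put(name, value); }
    public String get(String name) { return vars.get(name); }

    // Emulates setFloat for Hadoop 0.17 compatibility: convert the float
    // to a string first, then go through the generic setter.
    public void setFloatCompat(String name, float val) {
        set(name, Float.toString(val));
    }

    // Matching getter: parse the stored string back into a float.
    public float getFloatCompat(String name, float defaultVal) {
        String s = get(name);
        return s == null ? defaultVal : Float.parseFloat(s);
    }
}
```

The string round-trip is lossless for floats because Float.toString/parseFloat preserve the exact value.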
[jira] Updated: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1174:
-------------------------------

    Attachment: hive-1174.2.patch

> Job counter error if hive.merge.mapfiles equals true
> ----------------------------------------------------
>
>                 Key: HIVE-1174
>                 URL: https://issues.apache.org/jira/browse/HIVE-1174
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: hive-1174.1.patch, hive-1174.2.patch
>
> If hive.merge.mapfiles is set to true, the job counter will go to 3.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834529#action_12834529 ]

Zheng Shao commented on HIVE-1174:
----------------------------------

+1. Will test and commit.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834536#action_12834536 ]

He Yongqiang commented on HIVE-1174:
------------------------------------

The problem in this jira is:

1) countJobs in Driver only counts map-reduce tasks.
2) For the merge job there are two tasks (one dummy move task, and one merge task, which is an MR task). countJobs therefore counts 1 job, because the dummy move is not a map-reduce task.
3) But a bug in ConditionalTask always increments jobCounter by 1 when removing a task from the candidate task list, even when that task is not an MR task.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
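The counting rule Yongqiang describes can be modeled in a few lines (illustrative classes, not the actual Driver/ConditionalTask code): only tasks that really launch a map-reduce job should bump the counter, so a non-MR task such as the dummy move selected by the conditional resolver must be skipped.

```java
import java.util.List;

// Illustrative model of the fixed counting rule: only map-reduce tasks
// contribute to the job counter; a dummy move task chosen when
// hive.merge.mapfiles=true must not be counted.
public class JobCounterSketch {
    interface Task { boolean isMapRedTask(); }
    static class MapRedTask implements Task { public boolean isMapRedTask() { return true; } }
    static class MoveTask implements Task { public boolean isMapRedTask() { return false; } }

    static int countJobs(List<? extends Task> tasks) {
        int jobs = 0;
        for (Task t : tasks) {
            // The fix described in 3): guard the increment so that a
            // non-MR task leaving the candidate list does not bump the count.
            if (t.isMapRedTask()) {
                jobs++;
            }
        }
        return jobs;
    }
}
```

With the guard in place, a plan of two MR tasks plus one dummy move counts as 2 jobs rather than 3.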
[jira] Resolved: (HIVE-917) Bucketed Map Join
[ https://issues.apache.org/jira/browse/HIVE-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-917.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
     Hadoop Flags: [Reviewed]

Committed. Thanks Yongqiang

> Bucketed Map Join
> -----------------
>
>                 Key: HIVE-917
>                 URL: https://issues.apache.org/jira/browse/HIVE-917
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>         Attachments: hive-917-2010-2-15.patch, hive-917-2010-2-16.patch, hive-917-2010-2-3.patch, hive-917-2010-2-8.patch
>
> Hive already has support for map-join. Map-join treats the big table as job input, and in each mapper, it loads all data from a small table. In case the big table is already bucketed on the join key, we don't have to load the whole small table in each of the mappers. This will greatly alleviate the memory pressure, and make map-join work with medium-sized tables.
>
> There are 4 steps we can improve:
>
> S0. This is what the user can already do now: create a new bucketed table and insert all data from the small table to it; submit BUCKETNUM jobs, each doing a map-side join of bigtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS) with smallbucketedtable TABLEPARTITION(BUCKET i OUT OF NBUCKETS).
> S1. Change the code so that when map-join is loading the small table, we automatically drop the rows with keys that are NOT in the same bucket as the big table. This should alleviate the problem on memory, but we might still have thousands of mappers reading the whole of the small table.
> S2. Let's say the user already bucketed the small table on the join key into exactly the same number of buckets (or a factor of the buckets of the big table); then map-join can choose to load only the buckets that are useful.
> S3. Add a new hint (e.g. /*+ MAPBUCKETJOIN(a) */), so that Hive automatically does S2, without the need of asking the user to create a temporary bucketed table for the small table.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
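The bucket-pruning idea in S1 above can be sketched as follows (illustrative code, not the committed implementation; the standard Hive rule of assigning a key to bucket (hash & Integer.MAX_VALUE) % nBuckets is assumed): while a mapper loads the small table, it keeps only rows whose join key hashes into the same bucket as the mapper's big-table split.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the S1 filtering rule: while the mapper loads the
// small table, drop rows whose join key falls outside the big-table split's
// bucket. Assumes Hive's usual bucket assignment:
// (hashCode & Integer.MAX_VALUE) % nBuckets.
public class BucketFilterSketch {
    static int bucketFor(Object joinKey, int nBuckets) {
        return (joinKey.hashCode() & Integer.MAX_VALUE) % nBuckets;
    }

    // Keep only small-table keys that land in the current mapper's bucket.
    static List<String> filterSmallTable(List<String> keys, int bucketId, int nBuckets) {
        List<String> kept = new ArrayList<>();
        for (String k : keys) {
            if (bucketFor(k, nBuckets) == bucketId) {
                kept.add(k);
            }
        }
        return kept;
    }
}
```

Every key lands in exactly one bucket, so across all mappers the union of the filtered lists covers the whole small table while each mapper holds only its share in memory.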
[jira] Commented: (HIVE-984) Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
[ https://issues.apache.org/jira/browse/HIVE-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834544#action_12834544 ]

Carl Steinbach commented on HIVE-984:
-------------------------------------

bq. If at all possible, we want to keep building all supported shims as part of ant package to make sure that when a change breaks one, the developer finds out early (before even submitting a bad patch)

Unless your change specifically mucks with the shim code, I think it's unlikely that you're going to introduce a compile-time error. It seems more likely that you would cause a test error, and that's something you will only catch if you run the full test suite against all supported versions -- something that we only expect Hudson to do.

Which brings up another point: how do we configure JIRA/Hudson to automatically test submitted patches? The Hadoop and Pig projects are both set up to do this, but I can't find any references to how it was done. Do either of you know how to set this up, or have objections to doing so?

bq. Before doing anything else, I'm going to see if a more reliable source than archive.apache.org would address the problem. I'll test this with my home network tomorrow, which usually fails with archive.apache.org.

Over the weekend I figured out that there are actually two different reasons why people are encountering errors during the download process, and I wanted to make sure that everyone else is aware of this as well:

# Unable to connect to archive.apache.org: we can fix this by adding additional Apache mirrors (see http://www.apache.org/mirrors/) to the hadoop-source resolver in ivysettings, and also by letting people know that they can explicitly set the mirror location using the hadoop.mirror property.
# -Dhadoop.version=0.20.1: when people set hadoop.version to 0.20.1 it causes ant to download both 0.20.0 *and* 0.20.1, which is unnecessary since the API does not change between patch releases. But the bigger problem is that 0.20.1's md5 checksum file on archive.apache.org contains an md5 hash along with a bunch of other garbage that breaks ivy. We can fix this either by disabling checksums for archive.apache.org (set ivy.checksums= on that resolver), or by enhancing the build script so that it ignores patch release numbers and maps 0.20.1 to 0.20.0.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1134) bucketing mapjoin where the big table contains more than 1 big partition
[ https://issues.apache.org/jira/browse/HIVE-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1134:
-------------------------------

    Summary: bucketing mapjoin where the big table contains more than 1 big partition  (was: bucketing mapjoin where the big table contains more than 1 big table)

> bucketing mapjoin where the big table contains more than 1 big partition
> ------------------------------------------------------------------------
>
>                 Key: HIVE-1134
>                 URL: https://issues.apache.org/jira/browse/HIVE-1134
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1163) Eclipse launchtemplate changes to enable debugging
[ https://issues.apache.org/jira/browse/HIVE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1163:
-----------------------------

    Attachment: HIVE-1163_3.patch

Hi Carl, I've made the JAVA_HOME and README.txt changes you suggested. I tested the JAVA_HOME change on Linux and it doesn't seem to work (unset JAVA_HOME and launch Eclipse for debugging). Can you try it on a Mac and see if it works?

> Eclipse launchtemplate changes to enable debugging
> --------------------------------------------------
>
>                 Key: HIVE-1163
>                 URL: https://issues.apache.org/jira/browse/HIVE-1163
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1163.patch, HIVE-1163_2.patch, HIVE-1163_3.patch
>
> Some recent changes in build.xml and build-common.xml break the debugging functionality in Eclipse. Some system-defined properties were missing when running the Eclipse debugger.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-984) Building Hive occasionally fails with Ivy error: hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
[ https://issues.apache.org/jira/browse/HIVE-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834558#action_12834558 ]

John Sichi commented on HIVE-984:
---------------------------------

Shims: you may be right, but I guess the principle is that all source code checked in ought to be covered by the build if possible. It's arguable that we should actually do even more in this respect (rather than less), since for example in HIVE-1136 we just hit a case where one of my changes was incompatible with an old Hadoop version (nothing to do with shims). If we built against all supported Hadoop versions as part of ant test, this would have been caught when I ran tests myself (so Zheng would never have had to spend time testing my bad patch and rejecting it). ant test might be a reasonable place for that, since test time will always be orders of magnitude longer than build time. (But note: I'm not proposing to run tests on all versions except in Hudson!)

Hudson automatically testing patches: I don't know the answer to that one, but it sounds like a very high-value automation to me if the resources are available, and my opinion on the version download issue might change if this were working reliably with permanently committed resources.

archive.apache.org: the default mirroring for Hadoop seems to be 0.18.3, 0.19.2, and 0.20.1 (that's what I see when I browse most of the mirrors), which doesn't match what Hive currently wants (0.17.2.1, 0.18.3, 0.19.0, and 0.20.0). That's why I was thinking we might need a custom setup on mirror.facebook.net.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834562#action_12834562 ]

John Sichi commented on HIVE-1136:
----------------------------------

Per discussion with Zheng, added a new shim method to deal with the incompatibility.

> add type-checking setters for HiveConf class to match existing getters
> ----------------------------------------------------------------------
>
>                 Key: HIVE-1136
>                 URL: https://issues.apache.org/jira/browse/HIVE-1136
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Configuration
>    Affects Versions: 0.6.0
>            Reporter: John Sichi
>            Assignee: John Sichi
>             Fix For: 0.6.0
>         Attachments: HIVE-1136.1.patch, HIVE-1136.2.patch
>
> This is a followup from HIVE-1129.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
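The shim approach mentioned above can be sketched as follows (hypothetical names throughout; Hive does have a HadoopShims layer, but this particular interface and method signature are illustrative, not the committed patch): version-specific behavior hides behind an interface with one implementation per supported Hadoop release, and the 0.17 implementation routes through the generic string setter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the shim pattern: a per-version implementation
// encapsulates the API difference so callers never touch setFloat directly.
public class ShimSketch {
    interface HadoopShims {
        void setFloatConf(ConfLike conf, String name, float val);
    }

    // Minimal stand-in for Configuration's lowest-common-denominator API.
    static class ConfLike {
        private final Map<String, String> m = new HashMap<>();
        void set(String k, String v) { m.put(k, v); }
        String get(String k) { return m.get(k); }
    }

    // 0.17 shim: Configuration.setFloat does not exist there, so the value
    // is converted to a string and stored through the generic set().
    static class Hadoop17Shims implements HadoopShims {
        public void setFloatConf(ConfLike conf, String name, float val) {
            conf.set(name, Float.toString(val));
        }
    }
}
```

A newer-version shim could call the native setFloat instead; callers pick the shim once at startup and stay version-agnostic.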
[jira] Updated: (HIVE-1163) Eclipse launchtemplate changes to enable debugging
[ https://issues.apache.org/jira/browse/HIVE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1163:
---------------------------------

    Attachment: HIVE-1163.4.patch

* Fixed the JAVA_HOME problem.
* Disabled automatic appending of Eclipse system environment to test environment.
* Adjusted the formatting in README.txt and tweaked the Eclipse instructions.
* Added a HiveCLI launch configuration for running the CLI from within Eclipse.

I'm running the tests right now. Things look good so far.

> Eclipse launchtemplate changes to enable debugging
> --------------------------------------------------
>
>                 Key: HIVE-1163
>                 URL: https://issues.apache.org/jira/browse/HIVE-1163
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1163.4.patch, HIVE-1163.patch, HIVE-1163_2.patch, HIVE-1163_3.patch
>
> Some recent changes in the build.xml and build-common.xml breaks the debugging functionality in eclipse. Some system defined properties were missing when running eclipse debugger.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1174) Job counter error if hive.merge.mapfiles equals true
[ https://issues.apache.org/jira/browse/HIVE-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-1174:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
     Release Note: HIVE-1174. Fix Job counter error if hive.merge.mapfiles equals true. (Yongqiang He via zshao)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed. Thanks Yongqiang!

> Job counter error if hive.merge.mapfiles equals true
> ----------------------------------------------------
>
>                 Key: HIVE-1174
>                 URL: https://issues.apache.org/jira/browse/HIVE-1174
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>         Attachments: hive-1174.1.patch, hive-1174.2.patch
>
> If hive.merge.mapfiles is set to true, the job counter will go to 3.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1136) add type-checking setters for HiveConf class to match existing getters
[ https://issues.apache.org/jira/browse/HIVE-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834630#action_12834630 ]

Zheng Shao commented on HIVE-1136:
----------------------------------

+1. Will test and commit.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [VOTE] release hive 0.5.0
+1

On Mon, Feb 15, 2010 at 1:41 PM, Zheng Shao zsh...@gmail.com wrote:
> Hive branch 0.5 was created 5 weeks ago:
> https://svn.apache.org/viewvc/hadoop/hive/branches/branch-0.5/
>
> It has also been running as the production version of Hive at Facebook for 2 weeks.
>
> We'd like to start making release candidates (for 0.5.0) from branch 0.5.
> Please vote.
>
> --
> Yours,
> Zheng
[jira] Created: (HIVE-1175) Enable automatic patch testing on Hudson
Enable automatic patch testing on Hudson
----------------------------------------

                Key: HIVE-1175
                URL: https://issues.apache.org/jira/browse/HIVE-1175
            Project: Hadoop Hive
         Issue Type: Task
         Components: Build Infrastructure
           Reporter: Carl Steinbach
           Assignee: Carl Steinbach

See http://developer.yahoo.net/blogs/hadoop/2007/12/if_it_hurts_automate_it_1.html

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1175) Enable automatic patch testing on Hudson
[ https://issues.apache.org/jira/browse/HIVE-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834651#action_12834651 ]

Carl Steinbach commented on HIVE-1175:
--------------------------------------

The test-patch.sh script lives here: http://svn.apache.org/repos/asf/hadoop/core/nightly/test-patch

We will need to configure svn:externals in order to get this pulled in as part of hive trunk. Something like this:

  1. Check out hive trunk.
  2. cd hive-trunk
  3. export EDITOR=emacs
  4. svn propedit svn:externals testutils
     (The above step will open up emacs. Type in the following line and save it:)
     test-patch http://svn.apache.org/repos/asf/hadoop/nightly/test-patch
  5. svn commit

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1117) Make QueryPlan serializable
[ https://issues.apache.org/jira/browse/HIVE-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1117:
-----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed. Thanks Zheng

> Make QueryPlan serializable
> ---------------------------
>
>                 Key: HIVE-1117
>                 URL: https://issues.apache.org/jira/browse/HIVE-1117
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.6.0
>         Attachments: HIVE-1117.1.code.patch, HIVE-1117.1.test.patch
>
> We need to make QueryPlan serializable so that we can resume the query some time later.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1163) Eclipse launchtemplate changes to enable debugging
[ https://issues.apache.org/jira/browse/HIVE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834661#action_12834661 ]

Ning Zhang commented on HIVE-1163:
----------------------------------

+1. Changes look good. JAVA_HOME works on Linux as well.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HIVE-1158) Introducing a new parameter for Map-side join bucket size
[ https://issues.apache.org/jira/browse/HIVE-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao reopened HIVE-1158:
------------------------------

Need a patch for branch 0.5.

> Introducing a new parameter for Map-side join bucket size
> ---------------------------------------------------------
>
>                 Key: HIVE-1158
>                 URL: https://issues.apache.org/jira/browse/HIVE-1158
>             Project: Hadoop Hive
>          Issue Type: Improvement
>    Affects Versions: 0.5.0, 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1158.patch
>
> Map-side join caches the small table in memory and joins it with each split of the large table on the mapper side. If the small table is too large, it uses RowContainer to cache a number of rows indicated by the parameter hive.join.cache.size, whose default value is 25000. This parameter is also used by regular reducer-side joins to cache all input tables except the streaming table. This default value is too large for the map-side join bucket size, sometimes resulting in OOM exceptions. We should define a different parameter to separate these two cache sizes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.