[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141457#comment-15141457 ]

Hive QA commented on HIVE-1608:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787126/HIVE-1608.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6934/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6934/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6934/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6934/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2663f49 HIVE-12987: Add metrics for HS2 active users and SQL operations(Jimmy, reviewed by Szehon, Aihua)
+ git clean -f -d
Removing hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/AppConfig.java.orig
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 2663f49 HIVE-12987: Add metrics for HS2 active users and SQL operations(Jimmy, reviewed by Szehon, Aihua)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787126 - PreCommit-HIVE-TRUNK-Build

> use sequencefile as the default for storing intermediate results
>
>
> Key: HIVE-1608
> URL: https://issues.apache.org/jira/browse/HIVE-1608
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.7.0
> Reporter: Namit Jain
> Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-1608.1.patch, HIVE-1608.2.patch, HIVE-1608.3.patch, HIVE-1608.4.patch, HIVE-1608.5.patch, HIVE-1608.patch
>
>
> The only argument for having a text file for storing intermediate results seems to be better debuggability.
> But, tailing a sequence file is possible, and it should be more space efficient

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139432#comment-15139432 ]

Chaoyu Tang commented on HIVE-1608:
---

INSERT OVERWRITE [LOCAL] DIRECTORY is actually not affected by this change; I ran some tests and verified it. This is because Hive already uses the default TableDesc, whose file format is hardcoded as "TextFile", for these cases. See the related code:
{code}
SemanticAnalyzer.java -- line 6523:
if (qb.getIsQuery()) {
  String fileFormat = HiveConf.getVar(conf, HiveConf.ConfVars.HIVEQUERYRESULTFILEFORMAT);
  table_desc = PlanUtils.getDefaultQueryOutputTableDesc(cols, colTypes, fileFormat);
} else {
  table_desc = PlanUtils.getDefaultTableDesc(qb.getDirectoryDesc(), cols, colTypes);
}
---
PlanUtils.java -- 211, 224
public static TableDesc getDefaultTableDesc(String separatorCode,
    String columns, String columnTypes, boolean lastColumnTakesRestOfTheLine) {
  return getTableDesc(LazySimpleSerDe.class, separatorCode, columns,
      columnTypes, lastColumnTakesRestOfTheLine);
}

public static TableDesc getTableDesc(
    Class serdeClass, String separatorCode, String columns, String columnTypes,
    boolean lastColumnTakesRestOfTheLine, boolean useDelimitedJSON) {
  return getTableDesc(serdeClass, separatorCode, columns, columnTypes,
      lastColumnTakesRestOfTheLine, useDelimitedJSON, "TextFile");
}
{code}
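The branch described in the comment above can be modeled with a small standalone sketch. It is not Hive code: the class name `ResultFormatSketch` and the helper `chooseFileFormat` are hypothetical, a plain `Map` stands in for `HiveConf`, and the fallback value "SequenceFile" simply mirrors the new default discussed in this thread. Only the property name `hive.query.result.fileformat` and the hardcoded "TextFile" come from the thread itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the quoted branch; not Hive's actual classes. */
public class ResultFormatSketch {

    // Property name as given in this thread.
    static final String RESULT_FORMAT_KEY = "hive.query.result.fileformat";

    // Format hardcoded on the directory path, per the quoted PlanUtils code.
    static final String DIRECTORY_FORMAT = "TextFile";

    // Hypothetical helper: picks the file format for the final file sink.
    public static String chooseFileFormat(boolean isQuery, Map<String, String> conf) {
        if (isQuery) {
            // Query results follow the configured value
            // (assumed fallback: the new SequenceFile default).
            return conf.getOrDefault(RESULT_FORMAT_KEY, "SequenceFile");
        }
        // INSERT OVERWRITE [LOCAL] DIRECTORY never consults the setting.
        return DIRECTORY_FORMAT;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(chooseFileFormat(true, conf));   // query result
        System.out.println(chooseFileFormat(false, conf));  // directory insert
        conf.put(RESULT_FORMAT_KEY, "TextFile");
        System.out.println(chooseFileFormat(true, conf));   // explicit override
    }
}
```

In this model, changing the default only affects the first branch, which matches the observation that directory inserts keep writing text files.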
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139716#comment-15139716 ]

Brock Noland commented on HIVE-1608:

Thank you [~ctang.ma]! I came here to review so thanks to [~ashutoshc] as well.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139718#comment-15139718 ]

Chaoyu Tang commented on HIVE-1608:
---

Updated the wiki and documented SequenceFile as the new default value for hive.query.result.fileformat since Hive 2.1.0.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139442#comment-15139442 ]

Ashutosh Chauhan commented on HIVE-1608:

cool. +1
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138909#comment-15138909 ]

Hive QA commented on HIVE-1608:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12786826/HIVE-1608.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10024 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-unionDistinct_1.q-insert_values_non_partitioned.q-selectDistinctStar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_folder_predicate
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6918/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6918/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6918/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12786826 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15139113#comment-15139113 ]

Ashutosh Chauhan commented on HIVE-1608:

As noted earlier in the thread, this will be an incompatible change for the INSERT OVERWRITE DIRECTORY case. It seems your patch doesn't handle that.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15137128#comment-15137128 ]

Chaoyu Tang commented on HIVE-1608:
---

[~brocknoland], [~ashutoshc] could you review the patch at your earliest convenience, given that so many test files have been changed? Thanks in advance.

The change is straightforward:
1. Code: change the default of hive.query.result.fileformat to SequenceFile, so that newline characters in columns are supported by default.
2. Test output files: change all input and output formats of FileSinkOperator to SequenceFileInputFormat and SequenceFileOutputFormat.

Tests:
1. Some manual tests, including "insert overwrite [local] directory", which is actually not affected by this change.
2. Precommit tests.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136907#comment-15136907 ]

Hive QA commented on HIVE-1608:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12786628/HIVE-1608.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10035 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_folder_predicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_explode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucketpruning1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6904/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6904/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12786628 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132819#comment-15132819 ]

Chaoyu Tang commented on HIVE-1608:
---

FYI, regenerating the baselines of some failed tests is already under way.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132774#comment-15132774 ]

Aihua Xu commented on HIVE-1608:

It seems to make sense to switch to SequenceFile by default; it will save space. The newline characters in the intermediate text file are escaped, so we shouldn't have that issue any more. I will regenerate the baselines.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126189#comment-15126189 ]

Hive QA commented on HIVE-1608:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12785276/HIVE-1608.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1143 failed/errored test(s), 10047 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binarysortable_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126424#comment-15126424 ]

Chaoyu Tang commented on HIVE-1608:
---

More than one thousand tests failed, but most of them only need their output files updated: the input format changes from org.apache.hadoop.mapred.TextInputFormat to SequenceFileInputFormat, and the output format changes from org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat to org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124980#comment-15124980 ]

Aihua Xu commented on HIVE-1608:

I will take a look as well. I wasn't aware of this issue before.
[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results
[ https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124358#comment-15124358 ]

Brock Noland commented on HIVE-1608:

I hit this again. [~aihuaxu] [~ctang.ma] would someone from your team be interested in picking this one up?