[jira] [Resolved] (MAPREDUCE-5962) Support CRC32C in IFile
[ https://issues.apache.org/jira/browse/MAPREDUCE-5962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved MAPREDUCE-5962. Resolution: Won't Fix > Support CRC32C in IFile > --- > > Key: MAPREDUCE-5962 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5962 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: performance, task >Affects Versions: 2.5.0 >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Attachments: mapreduce-5962.txt, mapreduce-5962.txt > > > Currently, the IFile format used by the MR shuffle checksums all data using > the zlib CRC32 polynomial. If we allow use of CRC32C instead, we can get a > large reduction in CPU usage by leveraging the native hardware CRC32C > implementation (approx half a second of CPU time savings per GB checksummed). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449035#comment-16449035 ] Sergey Shelukhin commented on MAPREDUCE-7086: - Will fix whitespace on next review iteration; or you can fix it on commit ;) > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch, MAPREDUCE-7086.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449014#comment-16449014 ] Hadoop QA commented on MAPREDUCE-7086: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 33m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 2 new + 87 unchanged - 0 fixed = 89 total (was 87) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 50s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 64m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | MAPREDUCE-7086 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920348/MAPREDUCE-7086.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d1c5591ee489 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 42e82f0 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7396/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt | | whitespace |
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448958#comment-16448958 ] Robert Kanter commented on MAPREDUCE-7072: -- Great investigation [~wilfreds]! The patch looks good overall, just one thing: # Instead of using the iterator: {code:java} Iterator groupNames = totalCounters.iterator(); while (groupNames.hasNext()) { String groupName = groupNames.next().getName(); {code} I think we can still do a for-each loop like this: {code:java} for (CounterGroup counterGroup : totalCounters) { String groupName = counterGroup.getName(); {code} > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated MAPREDUCE-7086: Attachment: MAPREDUCE-7086.patch > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch, MAPREDUCE-7086.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448933#comment-16448933 ] Sergey Shelukhin commented on MAPREDUCE-7086: - Added the config, off by default; addressed other feedback. > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch, MAPREDUCE-7086.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448926#comment-16448926 ] Wangda Tan commented on MAPREDUCE-7086: --- [~sershe], oh that's my bad, moving the JIRA somehow cleaned the assignee, apologize for that. Jason has assigned this to you. > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned MAPREDUCE-7086: - Assignee: Sergey Shelukhin > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448823#comment-16448823 ] Sergey Shelukhin commented on MAPREDUCE-7086: - Can someone assign this back to me? I will update the patch in due course > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448815#comment-16448815 ] Hadoop QA commented on MAPREDUCE-7086: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 56s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 54m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | MAPREDUCE-7086 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920094/HADOOP-15403.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a98beaed3f38 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f411de6 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7395/testReport/ | | Max. process+thread count | 441 (vs. ulimit of 1) | | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7395/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT
[jira] [Assigned] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned MAPREDUCE-7086: - Assignee: (was: Sergey Shelukhin) Key: MAPREDUCE-7086 (was: HADOOP-15403) Project: Hadoop Map/Reduce (was: Hadoop Common) > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7086) FileInputFormat recursive=false fails instead of ignoring the directories.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448750#comment-16448750 ] Wangda Tan commented on MAPREDUCE-7086: --- I just moved this project to MAPREDUCE. > FileInputFormat recursive=false fails instead of ignoring the directories. > -- > > Key: MAPREDUCE-7086 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7086 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HADOOP-15403.patch > > > We are trying to create a split in Hive that will only read files in a > directory and not subdirectories. > That fails with the below error. > Given how this error comes about (two pieces of code interact, one explicitly > adding directories to results without failing, and one failing on any > directories in results), this seems like a bug. > {noformat} > Caused by: java.io.IOException: Not a file: > file:/,...warehouse/simple_to_mm_text/delta_001_001_ > at > org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:329) > ~[hadoop-mapreduce-client-core-3.1.0.jar:?] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:553) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:754) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:203) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > {noformat} > This code, when recursion is disabled, adds directories to results > {noformat} > if (recursive && stat.isDirectory()) { > result.dirsNeedingRecursiveCalls.add(stat); > } else { > result.locatedFileStatuses.add(stat); > } > {noformat} > However the getSplits code after that computes the size like this > {noformat} > long totalSize = 0; // compute total size > for (FileStatus file: files) {// check we have valid files > if (file.isDirectory()) { > throw new IOException("Not a file: "+ file.getPath()); > } > totalSize += > {noformat} > which would always fail combined with the above code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447715#comment-16447715 ] Hadoop QA commented on MAPREDUCE-7072: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 42s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 53s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | MAPREDUCE-7072 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920239/MAPREDUCE-7072.01.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6e80395b4c1c 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 63803e7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7394/testReport/ | | Max. process+thread count | 411 (vs. ulimit of 1) | | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7394/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > mapred job -history prints duplicate
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Status: Patch Available (was: Open) patch including tests: the json and human printer have both been changed to ignore the deprecated counters and keep code in sync > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Attachment: MAPREDUCE-7072.01.patch > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org