[jira] Created: (HADOOP-2647) dfs -put hangs
dfs -put hangs
--------------

Key: HADOOP-2647
URL: https://issues.apache.org/jira/browse/HADOOP-2647
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.15.1
Environment: LINUX
Reporter: lohit vijayarenu

We saw a case where dfs -put hung for over 20 hours while copying a 2GB file. When we looked at the stack trace of the process, the main thread was waiting for confirmation of complete status from the namenode. Only 4 blocks had been copied, and the 5th block was missing when we ran fsck on the partially transferred file. There are 2 problems we saw here:

1. The DFS client hung without a timeout when there was no response from the namenode.

2. In IOUtils::copyBytes(InputStream in, OutputStream out, int buffSize, boolean close), if there is an exception during the copy, it is not caught before out.close() is called, which is why we see a close call in the stack trace.

When we checked for block IDs in the namenode log, the block which was missing had only one response to the namenode instead of three. This close state, coupled with the namenode not reporting the error back, might have caused the whole process to hang. Opening this JIRA to see if we could add checks for the two problems mentioned above.
"main" prio=10 tid=0x0805a000 nid=0x5b53 waiting on condition [0xf7e64000..0xf7e65288]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1751)
        - locked <0x77d593a0> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:826)
        at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:114)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1472)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
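The second problem above (an uncaught copy exception surfacing as a hanging close()) can be illustrated with a minimal, self-contained sketch. This is plain java.io with a hypothetical class name, not the actual Hadoop IOUtils code: closing in a finally block, and swallowing secondary close() failures, keeps the original copy exception visible instead of letting close() mask it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class SafeCopy {
    // Copy all bytes from in to out. Streams are closed in a finally block,
    // and a failure inside close() is suppressed so it cannot replace an
    // exception thrown during the copy itself.
    static long copyBytes(InputStream in, OutputStream out, int buffSize, boolean close)
            throws IOException {
        byte[] buf = new byte[buffSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
                total += n;
            }
        } finally {
            if (close) {
                try { out.close(); } catch (IOException ignored) { } // don't mask copy errors
                try { in.close(); } catch (IOException ignored) { }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = copyBytes(new ByteArrayInputStream(data), out, 4, true);
        System.out.println("copied " + n + " bytes"); // prints: copied 10 bytes
    }
}
```

This does not address the missing namenode timeout (problem 1), which needs a bounded wait on the client side rather than a change to the copy loop.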
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558344#action_12558344 ]

lohit vijayarenu commented on HADOOP-2570:
------------------------------------------

I checked out trunk, applied this patch, and ran 'ant test'. Apart from these two:

org.apache.hadoop.hbase.TestMergeMeta
org.apache.hadoop.hbase.TestMergeTable

all tests passed.

> streaming jobs fail after HADOOP-2227
> -------------------------------------
>
> Key: HADOOP-2570
> URL: https://issues.apache.org/jira/browse/HADOOP-2570
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Affects Versions: 0.15.2
> Reporter: lohit vijayarenu
> Assignee: Amareshwari Sri Ramadasu
> Priority: Blocker
> Fix For: 0.15.3
>
> Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt
>
> HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this
> {code}
> File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work");
> {code}
> We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used there to construct the path are not public. And hard-coding the path in streaming does not look good. Thoughts?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2582) hadoop dfs -copyToLocal creates zero byte files when source file does not exist
[ https://issues.apache.org/jira/browse/HADOOP-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2582: - Status: Patch Available (was: Open) thanks raghu. making it PA > hadoop dfs -copyToLocal creates zero byte files, when source file does not > exists > -- > > Key: HADOOP-2582 > URL: https://issues.apache.org/jira/browse/HADOOP-2582 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.15.2 >Reporter: lohit vijayarenu > Attachments: HADOOP_2582_1.patch, HADOOP_2582_2.patch > > > hadoop dfs -copyToLocal with an no existing source file creates a zero byte > destination file. It should throw an error message indicating the source file > does not exists. > {noformat} > [lohit@ hadoop-trunk]$ hadoop dfs -get nosuchfile nosuchfile > [lohit@ hadoop-trunk]$ ls -l nosuchfile > -rw-r--r-- 1 lohit users 0 Jan 11 21:58 nosuchfile > [lohit@ hadoop-trunk]$ > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2582) hadoop dfs -copyToLocal creates zero byte files when source file does not exist
[ https://issues.apache.org/jira/browse/HADOOP-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2582: - Attachment: HADOOP_2582_2.patch Thanks Raghu, I have attached another patch which fixes FileUtil. Now we catch both -get and -put errors. > hadoop dfs -copyToLocal creates zero byte files, when source file does not > exists > -- > > Key: HADOOP-2582 > URL: https://issues.apache.org/jira/browse/HADOOP-2582 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.15.2 >Reporter: lohit vijayarenu > Attachments: HADOOP_2582_1.patch, HADOOP_2582_2.patch > > > hadoop dfs -copyToLocal with an no existing source file creates a zero byte > destination file. It should throw an error message indicating the source file > does not exists. > {noformat} > [lohit@ hadoop-trunk]$ hadoop dfs -get nosuchfile nosuchfile > [lohit@ hadoop-trunk]$ ls -l nosuchfile > -rw-r--r-- 1 lohit users 0 Jan 11 21:58 nosuchfile > [lohit@ hadoop-trunk]$ > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2582) hadoop dfs -copyToLocal creates zero byte files when source file does not exist
[ https://issues.apache.org/jira/browse/HADOOP-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2582: - Attachment: HADOOP_2582_1.patch Attached is a simple patch which checks for existence of source before initiating copy. Have updated TestDFSShell test case to check for this condition as well. > hadoop dfs -copyToLocal creates zero byte files, when source file does not > exists > -- > > Key: HADOOP-2582 > URL: https://issues.apache.org/jira/browse/HADOOP-2582 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.15.2 >Reporter: lohit vijayarenu > Attachments: HADOOP_2582_1.patch > > > hadoop dfs -copyToLocal with an no existing source file creates a zero byte > destination file. It should throw an error message indicating the source file > does not exists. > {noformat} > [lohit@ hadoop-trunk]$ hadoop dfs -get nosuchfile nosuchfile > [lohit@ hadoop-trunk]$ ls -l nosuchfile > -rw-r--r-- 1 lohit users 0 Jan 11 21:58 nosuchfile > [lohit@ hadoop-trunk]$ > {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558159#action_12558159 ] lohit vijayarenu commented on HADOOP-2570: -- testing the streaming job again. This patch solves the problem seen earlier. Thanks! > streaming jobs fail after HADOOP-2227 > - > > Key: HADOOP-2570 > URL: https://issues.apache.org/jira/browse/HADOOP-2570 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.15.2 >Reporter: lohit vijayarenu >Assignee: Amareshwari Sri Ramadasu >Priority: Blocker > Fix For: 0.15.3 > > Attachments: HADOOP-2570_1_20080112.patch, patch-2570.txt > > > HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed > like this > {code} > File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); > {code} > We should change this to get it working. Referring to the changes made in > HADOOP-2227, I see that the APIs used in there to construct the path are not > public. And hard coding the path in streaming does not look good. thought? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2582) hadoop dfs -copyToLocal creates zero byte files when source file does not exist
hadoop dfs -copyToLocal creates zero byte files when source file does not exist
-------------------------------------------------------------------------------

Key: HADOOP-2582
URL: https://issues.apache.org/jira/browse/HADOOP-2582
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.15.2
Reporter: lohit vijayarenu

hadoop dfs -copyToLocal with a non-existing source file creates a zero byte destination file. It should throw an error message indicating that the source file does not exist.

{noformat}
[lohit@ hadoop-trunk]$ hadoop dfs -get nosuchfile nosuchfile
[lohit@ hadoop-trunk]$ ls -l nosuchfile
-rw-r--r-- 1 lohit users 0 Jan 11 21:58 nosuchfile
[lohit@ hadoop-trunk]$
{noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
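The fix pattern the later patches describe (checking the source before initiating the copy) can be sketched in plain java.io. The class and method names here are hypothetical illustrations, not the actual FsShell/FileUtil code: the point is that the existence check must come before the destination is ever opened, since opening first is what leaves the zero-byte file behind.

```java
import java.io.File;
import java.io.FileNotFoundException;

public class GetGuard {
    // Fail fast on a missing source instead of creating an empty destination.
    static void copyToLocal(File src, File dst) throws FileNotFoundException {
        if (!src.exists()) {
            // Thrown before dst is opened, so no zero-byte file is created.
            throw new FileNotFoundException(src + ": No such file or directory");
        }
        // ... perform the actual byte copy here ...
    }
}
```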
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557909#action_12557909 ] lohit vijayarenu commented on HADOOP-2570: -- I tested this patch and it works for my earlier failing streaming job. Thanks! > streaming jobs fail after HADOOP-2227 > - > > Key: HADOOP-2570 > URL: https://issues.apache.org/jira/browse/HADOOP-2570 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.15.2 >Reporter: lohit vijayarenu >Assignee: Amareshwari Sri Ramadasu >Priority: Blocker > Fix For: 0.15.3 > > Attachments: patch-2570.txt > > > HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed > like this > {code} > File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work"); > {code} > We should change this to get it working. Referring to the changes made in > HADOOP-2227, I see that the APIs used in there to construct the path are not > public. And hard coding the path in streaming does not look good. thought? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2116) Job.local.dir to be exposed to tasks
[ https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557807#action_12557807 ] lohit vijayarenu commented on HADOOP-2116: -- Hi Amareshwari, I tested this patch against trunk for resolution of HADOOP-2570. This solves the problem mentioned in HADOOP-2570. Should this patch be marked to go in 0.15.3 ? Thanks, Lohit > Job.local.dir to be exposed to tasks > > > Key: HADOOP-2116 > URL: https://issues.apache.org/jira/browse/HADOOP-2116 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar >Assignee: Amareshwari Sri Ramadasu > Fix For: 0.16.0 > > Attachments: patch-2116.txt, patch-2116.txt > > > Currently, since all task cwds are created under a jobcache directory, users > that need a job-specific shared directory for use as scratch space, create > ../work. This is hacky, and will break when HADOOP-2115 is addressed. For > such jobs, hadoop mapred should expose job.local.dir via localized > configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557695#action_12557695 ]

lohit vijayarenu commented on HADOOP-2570:
------------------------------------------

The 2 places where the jobcache dir was used in streaming were to 'chmod' the executable and to look up this directory in PATH. Would it be OK to construct jobCacheDir as done in HADOOP-2227?

> streaming jobs fail after HADOOP-2227
> -------------------------------------
>
> Key: HADOOP-2570
> URL: https://issues.apache.org/jira/browse/HADOOP-2570
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Affects Versions: 0.15.2
> Reporter: lohit vijayarenu
> Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.15.3
>
> HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this
> {code}
> File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work");
> {code}
> We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used there to construct the path are not public. And hard-coding the path in streaming does not look good. Thoughts?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2570) streaming jobs fail after HADOOP-2227
streaming jobs fail after HADOOP-2227
-------------------------------------

Key: HADOOP-2570
URL: https://issues.apache.org/jira/browse/HADOOP-2570
Project: Hadoop
Issue Type: Bug
Components: contrib/streaming
Affects Versions: 0.15.2
Reporter: lohit vijayarenu
Fix For: 0.15.3

HADOOP-2227 changes jobCacheDir. In streaming, jobCacheDir was constructed like this

{code}
File jobCacheDir = new File(currentDir.getParentFile().getParent(), "work");
{code}

We should change this to get it working. Referring to the changes made in HADOOP-2227, I see that the APIs used there to construct the path are not public. And hard-coding the path in streaming does not look good. Thoughts?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2427) Cleanup of mapred.local.dir after maptask is complete
Cleanup of mapred.local.dir after maptask is complete
-----------------------------------------------------

Key: HADOOP-2427
URL: https://issues.apache.org/jira/browse/HADOOP-2427
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.15.1
Reporter: lohit vijayarenu

I see that after a map task is complete, its working directory (mapred.local.dir)/taskTracker/jobcache// is not deleted until the job is complete. If map output files are stored in there, could they be created in a different directory and the working directory cleaned up after the map task is complete? One problem we are seeing is that if a map task creates temporary files, they accumulate and we may run out of disk space, thus failing the job. Relying on the user to clean up all the temp files created is error prone.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HADOOP-2361) hadoop version wrong in 0.15.1
[ https://issues.apache.org/jira/browse/HADOOP-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu resolved HADOOP-2361. -- Resolution: Invalid > hadoop version wrong in 0.15.1 > -- > > Key: HADOOP-2361 > URL: https://issues.apache.org/jira/browse/HADOOP-2361 > Project: Hadoop > Issue Type: Bug > Components: build >Affects Versions: 0.15.1 >Reporter: lohit vijayarenu > > I downloaded 0.15.1 release, recompiled and executed ./bin/hadoop version. It > says 0.15.2-dev picking it from build.xml -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2361) hadoop version wrong in 0.15.1
[ https://issues.apache.org/jira/browse/HADOOP-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548875 ]

lohit vijayarenu commented on HADOOP-2361:
------------------------------------------

My bad, looks like I picked the 0.15 branch instead of the tag. Closing this as invalid.

> hadoop version wrong in 0.15.1
> ------------------------------
>
> Key: HADOOP-2361
> URL: https://issues.apache.org/jira/browse/HADOOP-2361
> Project: Hadoop
> Issue Type: Bug
> Components: build
> Affects Versions: 0.15.1
> Reporter: lohit vijayarenu
>
> I downloaded the 0.15.1 release, recompiled, and executed ./bin/hadoop version. It says 0.15.2-dev, picking it from build.xml

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2361) hadoop version wrong in 0.15.1
hadoop version wrong in 0.15.1 -- Key: HADOOP-2361 URL: https://issues.apache.org/jira/browse/HADOOP-2361 Project: Hadoop Issue Type: Bug Components: build Affects Versions: 0.15.1 Reporter: lohit vijayarenu I downloaded 0.15.1 release, recompiled and executed ./bin/hadoop version. It says 0.15.2-dev picking it from build.xml -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2302) Streaming should provide an option for numerical sort of keys
[ https://issues.apache.org/jira/browse/HADOOP-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2302: - Issue Type: Improvement (was: Bug) > Streaming should provide an option for numerical sort of keys > -- > > Key: HADOOP-2302 > URL: https://issues.apache.org/jira/browse/HADOOP-2302 > Project: Hadoop > Issue Type: Improvement > Components: contrib/streaming >Reporter: lohit vijayarenu > > It would be good to have an option for numerical sort of keys for streaming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2302) Streaming should provide an option for numerical sort of keys
Streaming should provide an option for numerical sort of keys -- Key: HADOOP-2302 URL: https://issues.apache.org/jira/browse/HADOOP-2302 Project: Hadoop Issue Type: Bug Components: contrib/streaming Reporter: lohit vijayarenu It would be good to have an option for numerical sort of keys for streaming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
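To make the request concrete, a numerical key sort comes down to comparing keys by numeric value rather than lexicographically (where "10" sorts before "9"). The sketch below is a hypothetical, plain-Java illustration of such a comparator, not the streaming option itself; in a real job the comparator would be plugged into the job configuration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class NumericKeySort {
    // Compare string keys by their numeric value, so "10" sorts after "9"
    // instead of before it as it would under lexicographic ordering.
    static final Comparator<String> NUMERIC =
            Comparator.comparingDouble(Double::parseDouble);

    // Return a numerically sorted copy of the given keys.
    static String[] sorted(String[] keys) {
        String[] copy = keys.clone();
        Arrays.sort(copy, NUMERIC);
        return copy;
    }
}
```

For example, sorted(new String[]{"10", "9", "2"}) yields {"2", "9", "10"}, whereas a plain lexicographic sort would yield {"10", "2", "9"}.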
[jira] Updated: (HADOOP-2193) dfs rm and rmr commands differ from POSIX standards
[ https://issues.apache.org/jira/browse/HADOOP-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lohit vijayarenu updated HADOOP-2193:
-------------------------------------

Attachment: HADOOP-2193-1.patch

Attached is a simple patch. Since fs.delete does not propagate FileNotFoundException, I have added an fs.exists() check before we try to move to trash and attempt the delete. Updated the test case to verify this.

> dfs rm and rmr commands differ from POSIX standards
> ---------------------------------------------------
>
> Key: HADOOP-2193
> URL: https://issues.apache.org/jira/browse/HADOOP-2193
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.1, 0.16.0
> Reporter: Mukund Madhugiri
> Fix For: 0.16.0
>
> Attachments: HADOOP-2193-1.patch
>
> Assuming the dfs commands follow POSIX standards, there are some problems with the DFS rm and rmr commands. I compared the DFS output with that of RHEL 4u5.
> In both cases, if the file/directory does not exist, DFS will not give any indication to the user.
> 1. rm a file/directory that does not exist:
> Linux: rm: cannot remove `testarea/two': No such file or directory
> DFS: rm: /testarea/two
> 2. rmr a file/directory that does not exist:
> Linux: rm: cannot remove `testarea/two': No such file or directory
> DFS: rm: /testarea/two

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
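The approach the patch describes can be sketched with plain java.io as a stand-in for the FileSystem API (the class name is hypothetical): since delete() reports failure only as a boolean and does not say why, an explicit existence check lets the shell emit the POSIX-style diagnostic before attempting the delete.

```java
import java.io.File;
import java.io.FileNotFoundException;

public class RmGuard {
    // delete() cannot distinguish "missing" from other failures, so check
    // existence first and report in the POSIX style the issue asks for.
    static void rm(File f) throws FileNotFoundException {
        if (!f.exists()) {
            throw new FileNotFoundException(
                    "cannot remove `" + f.getPath() + "': No such file or directory");
        }
        f.delete();
    }
}
```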
[jira] Updated: (HADOOP-2190) dfs ls and lsr commands differ from POSIX standards
[ https://issues.apache.org/jira/browse/HADOOP-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lohit vijayarenu updated HADOOP-2190:
-------------------------------------

Attachment: HADOOP-2190-2.patch

Another patch, which now calls listStatus instead of the deprecated listPaths(Path). Updated the test case and made the changes suggested by Mukund.

> dfs ls and lsr commands differ from POSIX standards
> ---------------------------------------------------
>
> Key: HADOOP-2190
> URL: https://issues.apache.org/jira/browse/HADOOP-2190
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.1, 0.16.0
> Reporter: Mukund Madhugiri
> Fix For: 0.16.0
>
> Attachments: HADOOP-2190-1.patch, HADOOP-2190-2.patch
>
> Assuming the dfs commands follow POSIX standards, there are some problems with the DFS ls and lsr commands. I compared the DFS output with that of RHEL 4u5.
> 1. ls a directory when there are no files/directories in that directory:
> Linux: No output
> DFS: Found 0 items
> 2. ls a file/directory that does not exist:
> Linux: ls: /doesnotexist: No such file or directory
> DFS: Found 0 items
> 3. lsr a directory that does not exist:
> Linux: ls: /doesnotexist: No such file or directory
> DFS: No output

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2190) dfs ls and lsr commands differ from POSIX standards
[ https://issues.apache.org/jira/browse/HADOOP-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542979 ] lohit vijayarenu commented on HADOOP-2190: -- sure, if patch is approved I will make these changes while adding testcase. > dfs ls and lsr commands differ from POSIX standards > --- > > Key: HADOOP-2190 > URL: https://issues.apache.org/jira/browse/HADOOP-2190 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.15.1, 0.16.0 >Reporter: Mukund Madhugiri > Fix For: 0.16.0 > > Attachments: HADOOP-2190-1.patch > > > Assuming the dfs commands follow POSIX standards, there are some problems > with the DFS ls and lsr commands. I compared the DFS output with that of > RHEL 4u5 > 1. ls a directory when there are no files/directories in that directory: > Linux: No output > DFS: Found 0 items > 2. ls a file/directory that does not exist: > Linux: ls: /doesnotexist: No such file or directory > DFS: Found 0 items > 3. lsr a directory that does not exist: > Linux: ls: /doesnotexist: No such file or directory > DFS: No output -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2213) Job submission gets Job tracker still initializing message while Namenode is in safemode
Job submission gets Job tracker still initializing message while Namenode is in safemode
----------------------------------------------------------------------------------------

Key: HADOOP-2213
URL: https://issues.apache.org/jira/browse/HADOOP-2213
Project: Hadoop
Issue Type: Bug
Components: dfs, mapred
Reporter: lohit vijayarenu
Priority: Minor

While the namenode is in safemode, if a user submits a job they receive a 'Job tracker still initializing' exception. It would be good if a more appropriate error message were thrown.

Job started: Thu Nov 15 23:15:39 UTC 2007
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTracker$IllegalStateException: Job tracker still initializing
        at org.apache.hadoop.mapred.JobTracker.ensureRunning(JobTracker.java:1505)
        at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:1513)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at org.apache.hadoop.ipc.Client.call(Client.java:482)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at $Proxy1.getNewJobId(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.getNewJobId(Unknown Source)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:452)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
        at org.apache.hadoop.examples.RandomWriter.run(RandomWriter.java:274)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.RandomWriter.main(RandomWriter.java:285)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:49)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2190) dfs ls and lsr commands differ from POSIX standards
[ https://issues.apache.org/jira/browse/HADOOP-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lohit vijayarenu updated HADOOP-2190:
-------------------------------------

Attachment: HADOOP-2190-1.patch

The attached patch throws an error for the ls, lsr, du, and dus commands for non-existing files. There were 2 changes: DistributedFileSystem.listPaths() was not returning null for non-existing files, and FileSystem.listPaths() was consuming the null returned from the underlying listPaths and returning a zero-length Path[] array. So I have made changes in a few places so that listPaths returns null when required. I ran the tests and they look good. If anyone could take a look at the changes and approve them, I will prepare test cases and update another patch. The issue reported was also seen on the local filesystem; fixing the second issue also takes care of the local filesystem. Below is the output after the patch:

{noformat}
ls/lsr command
[ hadoop-trunk]$ hadoop dfs -ls empty
Found 0 items
[ hadoop-trunk]$ hadoop dfs -ls nofile
ls: Could not get listing for nofile
[ hadoop-trunk]$ hadoop dfs -lsr nofile
lsr: Could not get listing for nofile

du command
[ hadoop-trunk]$ hadoop dfs -du empty
Found 0 items
[ hadoop-trunk]$ hadoop dfs -du nofile
du: Could not get listing for nofile
[ hadoop-trunk]$ hadoop dfs -dus empty
empty 0
[ hadoop-trunk]$ hadoop dfs -dus nofile
dus: dus: No match: nofile
[ hadoop-trunk]$
{noformat}

> dfs ls and lsr commands differ from POSIX standards
> ---------------------------------------------------
>
> Key: HADOOP-2190
> URL: https://issues.apache.org/jira/browse/HADOOP-2190
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.1, 0.16.0
> Reporter: Mukund Madhugiri
> Fix For: 0.16.0
>
> Attachments: HADOOP-2190-1.patch
>
> Assuming the dfs commands follow POSIX standards, there are some problems with the DFS ls and lsr commands. I compared the DFS output with that of RHEL 4u5.
> 1. ls a directory when there are no files/directories in that directory:
> Linux: No output
> DFS: Found 0 items
> 2. ls a file/directory that does not exist:
> Linux: ls: /doesnotexist: No such file or directory
> DFS: Found 0 items
> 3. lsr a directory that does not exist:
> Linux: ls: /doesnotexist: No such file or directory
> DFS: No output

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
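The null-versus-empty distinction at the heart of the patch has a direct analogue in plain java.io, where File.listFiles() likewise returns null for a path that cannot be listed and a zero-length array for an empty directory. The sketch below (hypothetical class name, not the Hadoop code) shows how preserving null rather than flattening it into an empty array lets the shell tell the two cases apart.

```java
import java.io.File;

public class ListingCheck {
    // Distinguish "empty directory" (zero-length array) from "path cannot be
    // listed" (null), mirroring what the patch does for listPaths.
    static String describe(File dir) {
        File[] entries = dir.listFiles(); // null when dir does not exist
        if (entries == null) {
            return "ls: Could not get listing for " + dir;
        }
        return "Found " + entries.length + " items";
    }
}
```

Collapsing null into new File[0] at any layer, as the old FileSystem.listPaths did, makes a missing path indistinguishable from an empty directory by the time the shell formats its output.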
[jira] Created: (HADOOP-2210) WebUI should also list current time
WebUI should also list current time --- Key: HADOOP-2210 URL: https://issues.apache.org/jira/browse/HADOOP-2210 Project: Hadoop Issue Type: Bug Reporter: lohit vijayarenu Priority: Minor It would be good if WebUI also listed current time (on all pages). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1722) Make streaming handle non-utf8 byte arrays
[ https://issues.apache.org/jira/browse/HADOOP-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542266 ]

lohit vijayarenu commented on HADOOP-1722:
------------------------------------------

bq. all other bytes (not characters!) including non-ascii and non-utf8 are passed literally through. Quoting is done on the stdin of the process and unquoting is done on the stdout of the process. This would make it very easy to write arbitrary binary values to the framework from streaming

+1. Would it be good to have an option which translates it, preserving the current behavior? It would be easier for a few map/reduce scripts if the framework translated it.

> Make streaming handle non-utf8 byte arrays
> ------------------------------------------
>
> Key: HADOOP-1722
> URL: https://issues.apache.org/jira/browse/HADOOP-1722
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/streaming
> Reporter: Runping Qi
> Assignee: Christopher Zimmerman
>
> Right now, the streaming framework expects the outputs of the stream process (mapper or reducer) to be line-oriented UTF-8 text. This limit makes it impossible to use those programs whose outputs may be non-UTF-8 (international encodings, or maybe even binary data). Streaming can overcome this limit by introducing a simple encoding protocol. For example, it can allow the mapper/reducer to hex-encode its keys/values, and the framework decodes them on the Java side. This way, as long as the mapper/reducer executables follow this encoding protocol, they can output arbitrary byte arrays and the streaming framework can handle them.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
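The hex-encoding protocol the issue proposes is simple enough to sketch directly. The class below is a hypothetical illustration of the Java-side codec, not part of Hadoop: a mapper/reducer would emit hex strings, which are safe line-oriented UTF-8, and the framework would decode them back to arbitrary bytes.

```java
public class HexCodec {
    // Encode arbitrary bytes as lowercase hex, which is always valid
    // line-oriented UTF-8 and so survives the streaming text protocol.
    static String encode(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decode a hex string produced by encode() back to the original bytes.
    static byte[] decode(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

The cost is doubling the data size on the wire; a framework option to enable this only for binary jobs, as the comment suggests, would keep the current pass-through behavior for text jobs.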
[jira] Updated: (HADOOP-2181) Input Split details for maps should be logged
[ https://issues.apache.org/jira/browse/HADOOP-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2181: - Component/s: mapred > Input Split details for maps should be logged > - > > Key: HADOOP-2181 > URL: https://issues.apache.org/jira/browse/HADOOP-2181 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Reporter: lohit vijayarenu >Priority: Minor > > It would be nice if Input split details are logged someplace. This might help > debugging failed map tasks -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2181) Input Split details for maps should be logged
Input Split details for maps should be logged - Key: HADOOP-2181 URL: https://issues.apache.org/jira/browse/HADOOP-2181 Project: Hadoop Issue Type: Improvement Reporter: lohit vijayarenu Priority: Minor It would be nice if Input split details are logged someplace. This might help debugging failed map tasks -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2151) FileSystem.globPaths does not validate the returned list of Paths
[ https://issues.apache.org/jira/browse/HADOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lohit vijayarenu updated HADOOP-2151:
-------------------------------------

Status: Patch Available  (was: Open)

Thanks for reviewing, Raghu. Making this Patch Available.

> FileSystem.globPaths does not validate the returned list of Paths
> -----------------------------------------------------------------
>
> Key: HADOOP-2151
> URL: https://issues.apache.org/jira/browse/HADOOP-2151
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.0, 0.14.3
> Reporter: lohit vijayarenu
> Fix For: 0.16.0
>
> Attachments: HADOOP-2151-2.patch, HADOOP-2151-3.patch, HADOOP-2151.patch
>
> FileSystem.globPaths does not validate the returned list of Paths. Here is an example. Consider a directory structure like
> /user/foo/DIR1/FILE1
> /user/foo/DIR2
> Now if we pass an input path like "/user/foo/*/FILE1" to FileSystem.globPaths(), it returns 2 entries as shown below:
> /user/foo/DIR1/FILE1
> /user/foo/DIR2/FILE1
> Should globPaths validate this and return only valid Paths? This behavior was caught in FileSystem.validateInput(), where an IOException is thrown while processing such a directory structure.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
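The validation step the issue asks for amounts to filtering the expanded candidate paths by existence before returning them. The sketch below uses java.io.File as a stand-in for Hadoop's Path/FileSystem (class and method names are hypothetical), so /user/foo/DIR2/FILE1 would be dropped from the example above because it does not actually exist.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class GlobValidate {
    // After expanding a glob pattern into candidate paths, keep only the
    // paths that actually exist, mirroring the validation globPaths lacked.
    static List<File> validate(List<File> candidates) {
        List<File> valid = new ArrayList<>();
        for (File f : candidates) {
            if (f.exists()) {
                valid.add(f);
            }
        }
        return valid;
    }
}
```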
[jira] Commented: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540775 ] lohit vijayarenu commented on HADOOP-1952: -- Hi Arun, Yes, these were present in the testcases but never used by the actual streaming command in those test cases. The invalid options passed in were ignored by the previous streaming code. The patch now catches such invalid options, so I updated the dependent test cases as well. > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Fix For: 0.16.0 > > Attachments: CatchInvalidInputFormat.patch, HADOOP-1952-1.patch, > HADOOP-1952-2.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2151) FileSyste.globPaths does not validate the return list of Paths
[ https://issues.apache.org/jira/browse/HADOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2151: - Attachment: HADOOP-2151-3.patch Another update with testcase. > FileSyste.globPaths does not validate the return list of Paths > -- > > Key: HADOOP-2151 > URL: https://issues.apache.org/jira/browse/HADOOP-2151 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.14.3, 0.15.0 >Reporter: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2151-2.patch, HADOOP-2151-3.patch, > HADOOP-2151.patch > > > FileSystem.globPaths does not validate the return list of Paths. > Here is an example. > Consider a directory structure like > /user/foo/DIR1/FILE1 > /user/foo/DIR2 > now if we pass an input path like "/user/foo/*/FILE1" to > FileSystem.globPaths() > It returns 2 entries as shown below > /user/foo/DIR1/FILE1 > /user/foo/DIR2/FILE1 > Should globPaths validate this and return only valid Paths? This behavior was > caught in FileSystem.validateInput() where an IOException is thrown while > processing such a directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2151) FileSyste.globPaths does not validate the return list of Paths
[ https://issues.apache.org/jira/browse/HADOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2151: - Attachment: HADOOP-2151-2.patch Thanks Raghu. Here is an updated patch. I also got rid of the ArrayList; we allocate a new array only when we filter out parents. > FileSyste.globPaths does not validate the return list of Paths > -- > > Key: HADOOP-2151 > URL: https://issues.apache.org/jira/browse/HADOOP-2151 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.14.3, 0.15.0 >Reporter: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2151-2.patch, HADOOP-2151.patch > > > FileSystem.globPaths does not validate the return list of Paths. > Here is an example. > Consider a directory structure like > /user/foo/DIR1/FILE1 > /user/foo/DIR2 > now if we pass an input path like "/user/foo/*/FILE1" to > FileSystem.globPaths() > It returns 2 entries as shown below > /user/foo/DIR1/FILE1 > /user/foo/DIR2/FILE1 > Should globPaths validate this and return only valid Paths? This behavior was > caught in FileSystem.validateInput() where an IOException is thrown while > processing such a directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2151) FileSyste.globPaths does not validate the return list of Paths
[ https://issues.apache.org/jira/browse/HADOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2151: - Attachment: HADOOP-2151.patch Attached is a patch which addresses this problem. globPaths checks whether the path exists by calling exists(), but only for those paths which were expanded via listPaths in the previous iteration. This is done by passing a new flag to the recursive function indicating whether the previous component was a glob or not. > FileSyste.globPaths does not validate the return list of Paths > -- > > Key: HADOOP-2151 > URL: https://issues.apache.org/jira/browse/HADOOP-2151 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.14.3, 0.15.0 >Reporter: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2151.patch > > > FileSystem.globPaths does not validate the return list of Paths. > Here is an example. > Consider a directory structure like > /user/foo/DIR1/FILE1 > /user/foo/DIR2 > now if we pass an input path like "/user/foo/*/FILE1" to > FileSystem.globPaths() > It returns 2 entries as shown below > /user/foo/DIR1/FILE1 > /user/foo/DIR2/FILE1 > Should globPaths validate this and return only valid Paths? This behavior was > caught in FileSystem.validateInput() where an IOException is thrown while > processing such a directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
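The validation idea in the comment above can be sketched as follows. This is a hypothetical simplification, not the actual HADOOP-2151 patch: `GlobValidationSketch`, `validate`, and the `existing` set are illustrative stand-ins for the real FileSystem.exists() check applied to candidates produced by expanding a glob component.

```java
import java.util.*;

// Hypothetical sketch (not the actual HADOOP-2151 patch): once a glob
// component such as "*" has been expanded, candidate paths built from the
// remaining literal components may not actually exist, so each candidate
// must be checked with exists() before it is returned.
class GlobValidationSketch {
    // 'existing' stands in for the filesystem's exists() check.
    static List<String> validate(List<String> candidates, Set<String> existing) {
        List<String> valid = new ArrayList<>();
        for (String p : candidates) {
            if (existing.contains(p)) { // keep only paths that really exist
                valid.add(p);
            }
        }
        return valid;
    }
}
```

With the directory layout from the issue description, only /user/foo/DIR1/FILE1 exists, so expanding "/user/foo/*/FILE1" and then validating drops the spurious /user/foo/DIR2/FILE1 candidate.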
[jira] Updated: (HADOOP-2151) FileSyste.globPaths does not validate the return list of Paths
[ https://issues.apache.org/jira/browse/HADOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2151: - Fix Version/s: 0.16.0 > FileSyste.globPaths does not validate the return list of Paths > -- > > Key: HADOOP-2151 > URL: https://issues.apache.org/jira/browse/HADOOP-2151 > Project: Hadoop > Issue Type: Bug > Components: dfs >Affects Versions: 0.14.3, 0.15.0 >Reporter: lohit vijayarenu > Fix For: 0.16.0 > > > FileSystem.globPaths does not validate the return list of Paths. > Here is an example. > Consider a directory structure like > /user/foo/DIR1/FILE1 > /user/foo/DIR2 > now if we pass an input path like "/user/foo/*/FILE1" to > FileSystem.globPaths() > It returns 2 entries as shown below > /user/foo/DIR1/FILE1 > /user/foo/DIR2/FILE1 > Should globPaths validate this and return only valid Paths? This behavior was > caught in FileSystem.validateInput() where an IOException is thrown while > processing such a directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2151) FileSyste.globPaths does not validate the return list of Paths
FileSyste.globPaths does not validate the return list of Paths -- Key: HADOOP-2151 URL: https://issues.apache.org/jira/browse/HADOOP-2151 Project: Hadoop Issue Type: Bug Components: dfs Affects Versions: 0.15.0, 0.14.3 Reporter: lohit vijayarenu FileSystem.globPaths does not validate the return list of Paths. Here is an example. Consider a directory structure like /user/foo/DIR1/FILE1 /user/foo/DIR2 now if we pass an input path like "/user/foo/*/FILE1" to FileSystem.globPaths() It returns 2 entries as shown below /user/foo/DIR1/FILE1 /user/foo/DIR2/FILE1 Should globPaths validate this and return only valid Paths? This behavior was caught in FileSystem.validateInput() where an IOException is thrown while processing such a directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't getting killed although the TIP completed
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539748 ] lohit vijayarenu commented on HADOOP-1281: -- We hit this bug today. Below is the log for 2 attempts for the same task Task Attempts Status Progress Start Time Finish Time Errors Task Logs Counters task_200711022153_0001_m_001548_0 SUCCEEDED 100.00% 2-Nov-2007 22:00:59 2-Nov-2007 22:05:50 (4mins, 51sec) task_200711022153_0001_m_001548_1 KILLED 84.44% 2-Nov-2007 22:02:17 2-Nov-2007 22:26:02 (23mins, 45sec) If you look at the time each of the attempts took, after the first attempt finished in ~4mins, the second attempt should have been killed. But it went ahead and kept running for ~23min. When we took a look at the logs, we saw that the attempt was issued a kill signal only after the whole job had completed. The JobTracker did not send a kill signal to this task attempt earlier (or maybe nothing was logged). > Speculative map tasks aren't getting killed although the TIP completed > -- > > Key: HADOOP-1281 > URL: https://issues.apache.org/jira/browse/HADOOP-1281 > Project: Hadoop > Issue Type: Bug > Components: mapred >Reporter: Arun C Murthy >Assignee: Arun C Murthy > > The speculative map tasks run to completion although the TIP succeeded since > the other task completed elsewhere. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Attachment: HADOOP-2071-5.patch Looks like an unrelated contrib test failed. But there were 2 findbugs warnings, which I have fixed; attaching a new patch. > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch, > HADOOP-2071-3.patch, HADOOP-2071-4.patch, HADOOP-2071-5.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2120) dfs -getMerge does not do what it says it does
[ https://issues.apache.org/jira/browse/HADOOP-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539170 ] lohit vijayarenu commented on HADOOP-2120: -- Visualizing this as a map-reduce job which actually merges/sorts into a single file, shouldn't it be available as a separate package (like distcp, maybe)? This feature of merging files would be very useful for users who would like to have only one output file. For now they would have to stick to a single reducer rather than submit a job with multiple reducers (even though that gives better machine utilization). Wouldn't a generic merge utility which understands the format and merges be useful? Something motivated by https://issues.apache.org/jira/browse/HADOOP-2113 > dfs -getMerge does not do what it says it does > -- > > Key: HADOOP-2120 > URL: https://issues.apache.org/jira/browse/HADOOP-2120 > Project: Hadoop > Issue Type: Bug > Components: mapred >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar > Fix For: 0.16.0 > > > dfs -getMerge, which calls FileUtil.CopyMerge, contains this javadoc: > {code} > Get all the files in the directories that match the source file pattern >* and merge and sort them to only one file on local fs >* srcf is kept. > {code} > However, it only concatenates the set of input files, rather than merging > them in sorted order. > Ideally, the copyMerge should be equivalent to a map-reduce job with > IdentityMapper and IdentityReducer with numReducers = 1. However, not having > to run this as a map-reduce job has some advantages, since it increases > cluster utilization during reduce phase. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Status: Patch Available (was: Open) Making this PA > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch, > HADOOP-2071-3.patch, HADOOP-2071-4.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Attachment: HADOOP-2071-4.patch Getting rid of an extra blank line in the patch. > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch, > HADOOP-2071-3.patch, HADOOP-2071-4.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Status: Patch Available (was: Open) Thanks Arun. Making this PA > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Fix For: 0.16.0 > > Attachments: CatchInvalidInputFormat.patch, HADOOP-1952-1.patch, > HADOOP-1952-2.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Attachment: HADOOP-1952-2.patch On the other hand, thinking about it, should we log at info level inside goodClassOrNull? If the mapper is 'cat', which is a valid executable, we should not log saying the cat class was not found, no? I am reverting the logging inside goodClassOrNull and handling the failures in StreamJob where needed. Thoughts? (Attached is a patch which reverts only the logging changes) > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Fix For: 0.16.0 > > Attachments: CatchInvalidInputFormat.patch, HADOOP-1952-1.patch, > HADOOP-1952-2.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Attachment: HADOOP-2071-3.patch BufferedInputStream does not provide a way to get the current position in the stream, and updating the encapsulated FSDataInputStream again would be like seeking back. So I have the position stored in pos_ and update it accordingly. Attaching a new patch with this change and a testcase. Could anyone please take a look? Thanks! > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Fix For: 0.16.0 > > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch, > HADOOP-2071-3.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
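The position-tracking approach described in the HADOOP-2071 comment above can be sketched without the Hadoop classes. This is a hypothetical illustration using only standard java.io, not the actual patch: `PositionTrackingStream` and `getPos` are illustrative names, with the byte counter playing the role of pos_ so the reader never needs mark()/reset().

```java
import java.io.*;

// Hypothetical illustration of the pos_ idea: BufferedInputStream offers no
// getPos(), so the wrapper counts every byte it hands out instead of relying
// on mark()/reset(), which the underlying DFS stream does not support.
class PositionTrackingStream extends FilterInputStream {
    private long pos; // bytes consumed so far, analogous to pos_

    PositionTrackingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) pos++; // only count bytes actually delivered
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) pos += n;
        return n;
    }

    long getPos() {
        return pos;
    }
}
```

After reading four bytes through the wrapper, getPos() reports 4 without ever touching mark() or reset() on the wrapped stream.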
[jira] Updated: (HADOOP-2089) Multiple caheArchive does not work in Hadoop streaming
[ https://issues.apache.org/jira/browse/HADOOP-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2089: - Status: Patch Available (was: Open) Making it patch available > Multiple caheArchive does not work in Hadoop streaming > -- > > Key: HADOOP-2089 > URL: https://issues.apache.org/jira/browse/HADOOP-2089 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar >Assignee: lohit vijayarenu >Priority: Critical > Fix For: 0.16.0 > > Attachments: HADOOP-2089-1.patch, HADOOP-2089-2.patch, > HADOOP-2089-3.patch > > > Multiple -cacheArchive options in Hadoop streaming does not work. Here is the > stack trace: > Exception in thread "main" java.lang.RuntimeException: > at org.apache.hadoop.streaming.StreamJob.fail(StreamJob.java:528) > at > org.apache.hadoop.streaming.StreamJob.exitUsage(StreamJob.java:469) > at > org.apache.hadoop.streaming.StreamJob.parseArgv(StreamJob.java:203) > at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:105) > at > org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:155) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2089) Multiple caheArchive does not work in Hadoop streaming
[ https://issues.apache.org/jira/browse/HADOOP-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2089: - Attachment: HADOOP-2089-3.patch Thanks Devaraj. I fixed this and attaching an updated patch. > Multiple caheArchive does not work in Hadoop streaming > -- > > Key: HADOOP-2089 > URL: https://issues.apache.org/jira/browse/HADOOP-2089 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar >Assignee: lohit vijayarenu >Priority: Critical > Fix For: 0.16.0 > > Attachments: HADOOP-2089-1.patch, HADOOP-2089-2.patch, > HADOOP-2089-3.patch > > > Multiple -cacheArchive options in Hadoop streaming does not work. Here is the > stack trace: > Exception in thread "main" java.lang.RuntimeException: > at org.apache.hadoop.streaming.StreamJob.fail(StreamJob.java:528) > at > org.apache.hadoop.streaming.StreamJob.exitUsage(StreamJob.java:469) > at > org.apache.hadoop.streaming.StreamJob.parseArgv(StreamJob.java:203) > at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:105) > at > org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:155) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Attachment: HADOOP-1952-1.patch I am attaching an updated patch, which logs at info level inside goodClassOrNull method and fails for invalid class in partitioner/combiner/output format/inputformat. Fixed streaming test cases which were using invalid combiner which again was ignored previously. > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Fix For: 0.16.0 > > Attachments: CatchInvalidInputFormat.patch, HADOOP-1952-1.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537838 ] lohit vijayarenu commented on HADOOP-1952: -- Ok, I should have mentioned that StreamUtil.goodClassOrNull is actually used to check whether the map command passed is a class or an executable. So -mapper could take either a class or just an executable, and if it is not a valid class, null (as the method name indicates) is returned and StreamJob treats it as a map executable. How about logging at info level inside goodClassOrNull and, instead of throwing an exception, failing at the appropriate places where we do not allow null? If so, I will modify that and submit a patch. > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Fix For: 0.16.0 > > Attachments: CatchInvalidInputFormat.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
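The contract described in the comment above can be sketched in a few lines. This is a hypothetical reconstruction of the behavior, not the actual StreamUtil source: the class name `ClassOrExecutableSketch` is illustrative, resolution is attempted with Class.forName, and null signals "treat the argument as an executable".

```java
// Hypothetical reconstruction of the goodClassOrNull contract described
// above (not the actual StreamUtil code): try to resolve the argument as a
// class; on failure return null so the caller can treat it as an executable
// command such as 'cat'.
class ClassOrExecutableSketch {
    static Class<?> goodClassOrNull(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return null; // not a class; caller treats it as an executable
        }
    }
}
```

Under this contract, -mapper org.example.MyMapper resolves to a class while -mapper cat yields null and is run as a command, which is exactly why silently returning null for a mistyped -inputformat class was easy to miss.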
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Status: Open (was: Patch Available) Thanks Raghu. I will look into this case and resubmit new one. Canceling the patch. > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Fix For: 0.15.0 > > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2101) JobTracker Startup failed with java.net.BindException
JobTracker Startup failed with java.net.BindException - Key: HADOOP-2101 URL: https://issues.apache.org/jira/browse/HADOOP-2101 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.14.3 Reporter: lohit vijayarenu We have seen one case where the JobTracker failed with an IOException, but later retries of startup failed with a BindException, going into a loop before finally failing. Here is the stacktrace. 2007-10-23 05:51:19,374 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: java.io.IOException: Problem starting http server at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:202) at org.apache.hadoop.mapred.JobTracker.(JobTracker.java:659) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:108) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:1788) Caused by: org.mortbay.util.MultiException[java.net.BindException: Address already in use] at org.mortbay.http.HttpServer.doStart(HttpServer.java:731) at org.mortbay.util.Container.start(Container.java:72) at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:179) ... 3 more 2007-10-23 05:51:20,421 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.apache.hadoop.ipc.Server$Listener.(Server.java:185) at org.apache.hadoop.ipc.Server.(Server.java:627) at org.apache.hadoop.ipc.RPC$Server.(RPC.java:324) at org.apache.hadoop.ipc.RPC.getServer(RPC.java:294) at org.apache.hadoop.mapred.JobTracker.(JobTracker.java:647) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:108) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:1788) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537658 ] lohit vijayarenu commented on HADOOP-1952: -- Thanks for looking into this, Arun. The use case here was that when a user specifies a partitioner/combiner class and we discover the class is not available, the earlier code used to just ignore it. So I added the message to let them know that their specified class does not exist and that we are defaulting. Should I move it to DEBUG and resubmit a patch? > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Attachments: CatchInvalidInputFormat.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2089) Multiple caheArchive does not work in Hadoop streaming
[ https://issues.apache.org/jira/browse/HADOOP-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2089: - Attachment: HADOOP-2089-2.patch Updating the patch with a test case. > Multiple caheArchive does not work in Hadoop streaming > -- > > Key: HADOOP-2089 > URL: https://issues.apache.org/jira/browse/HADOOP-2089 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar >Assignee: lohit vijayarenu >Priority: Critical > Attachments: HADOOP-2089-1.patch, HADOOP-2089-2.patch > > > Multiple -cacheArchive options in Hadoop streaming does not work. Here is the > stack trace: > Exception in thread "main" java.lang.RuntimeException: > at org.apache.hadoop.streaming.StreamJob.fail(StreamJob.java:528) > at > org.apache.hadoop.streaming.StreamJob.exitUsage(StreamJob.java:469) > at > org.apache.hadoop.streaming.StreamJob.parseArgv(StreamJob.java:203) > at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:105) > at > org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:155) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2089) Multiple caheArchive does not work in Hadoop streaming
[ https://issues.apache.org/jira/browse/HADOOP-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2089: - Attachment: HADOOP-2089-1.patch Small Fix to properly parse command line option > Multiple caheArchive does not work in Hadoop streaming > -- > > Key: HADOOP-2089 > URL: https://issues.apache.org/jira/browse/HADOOP-2089 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 > Environment: All >Reporter: Milind Bhandarkar >Assignee: lohit vijayarenu >Priority: Critical > Attachments: HADOOP-2089-1.patch > > > Multiple -cacheArchive options in Hadoop streaming does not work. Here is the > stack trace: > Exception in thread "main" java.lang.RuntimeException: > at org.apache.hadoop.streaming.StreamJob.fail(StreamJob.java:528) > at > org.apache.hadoop.streaming.StreamJob.exitUsage(StreamJob.java:469) > at > org.apache.hadoop.streaming.StreamJob.parseArgv(StreamJob.java:203) > at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:105) > at > org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:155) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-2066) filenames with ':' colon throws java.lang.IllegalArgumentException
[ https://issues.apache.org/jira/browse/HADOOP-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536716 ] lohit vijayarenu commented on HADOOP-2066: -- Hi Nicholas, I just tried this patch against trunk. Am I doing this right?
[lohit@ hadoop-trunk]$ hadoop dfs -put ./abcd:efgh.tar.gz "abcd\:efgh.tar.gz"
put: Pathname /user/lohit/abcd\:efgh.tar.gz from abcd\:efgh.tar.gz is not a valid DFS filename.
[lohit@ hadoop-trunk]$ hadoop dfs -put ./abcd:efgh.tar.gz "abcd%3aefgh.tar.gz"
[lohit@ hadoop-trunk]$ hadoop dfs -ls
Found 3 items
/user/lohit/abcd 5 2007-10-22 15:38
/user/lohit/abcd%3aefgh.tar.gz 5 2007-10-22 15:39
/user/lohit/test.x 61646 2007-10-22 15:38
[EMAIL PROTECTED] hadoop-trunk]$
Thanks! > filenames with ':' colon throws java.lang.IllegalArgumentException > -- > > Key: HADOOP-2066 > URL: https://issues.apache.org/jira/browse/HADOOP-2066 > Project: Hadoop > Issue Type: Bug > Components: dfs >Reporter: lohit vijayarenu > Attachments: 2066_20071019.patch, HADOOP-2066.patch > > > File names containing colon ":" throws java.lang.IllegalArgumentException > while LINUX file system supports it. 
> $ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest > Exception in thread "main" java.lang.IllegalArgumentException: > java.net.URISyntaxException: Relative path in absolute > URI: testfile-2007-09-24-03:00:00.gz > at org.apache.hadoop.fs.Path.initialize(Path.java:140) > at org.apache.hadoop.fs.Path.(Path.java:126) > at org.apache.hadoop.fs.Path.(Path.java:50) > at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757) > at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > testfile-2007-09-24-03:00:00.gz > at java.net.URI.checkPath(URI.java:1787) > at java.net.URI.(URI.java:735) > at org.apache.hadoop.fs.Path.initialize(Path.java:137) > ... 10 more > Path(String pathString) when given a filename which contains ':' treats it as > URI and selects anything before ':' as > scheme, which in this case is clearly not a valid scheme. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Status: Patch Available (was: Open) Making this patch available. > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Attachments: CatchInvalidInputFormat.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Assignee: lohit vijayarenu > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu >Priority: Minor > Attachments: CatchInvalidInputFormat.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Attachment: HADOOP-2071-2.patch With inputs from Raghu and Milind, here is an updated patch. This wraps the FSDataInputStream in a BufferedInputStream and eliminates seek(). The patch also includes a simple test case for StreamXmlRecordReader. > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Attachments: HADOOP-2071-1.patch, HADOOP-2071-2.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(StreamXmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(StreamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader.java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.java:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
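The contract behind this fix can be seen with plain java.io streams: a stream whose markSupported() is false throws on reset(), while wrapping it in a java.io.BufferedInputStream adds mark/reset support. A minimal sketch (plain InputStream stand-ins, not DFSClient code):

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetDemo {
    // A bare InputStream: markSupported() is false by default, reset() throws.
    static InputStream rawStream() {
        return new InputStream() {
            @Override public int read() { return -1; }
        };
    }

    // reset() on a stream without mark support fails with
    // IOException("mark/reset not supported"), like DFSInputStream here.
    static boolean rawResetFails() {
        try {
            rawStream().reset();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // BufferedInputStream buffers the data it reads, so it can honor
    // mark()/reset() regardless of the underlying stream.
    static boolean bufferedSupportsMark() {
        return new BufferedInputStream(rawStream()).markSupported();
    }

    public static void main(String[] args) {
        System.out.println("raw reset fails: " + rawResetFails());
        System.out.println("buffered supports mark: " + bufferedSupportsMark());
    }
}
```

This mirrors the design choice in the patch: instead of relying on the DFS stream to support mark/reset, buffer it and let the buffer provide that capability.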
[jira] Commented: (HADOOP-2066) filenames with ':' colon throws java.lang.IllegalArgumentException
[ https://issues.apache.org/jira/browse/HADOOP-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535986 ] lohit vijayarenu commented on HADOOP-2066: -- How would be the usage with this patch? I tried these [EMAIL PROTECTED] hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz "testfile-2007-09-24-03%3a00%3a00.gz" put: Pathname /user/lohit/testfile-2007-09-24-03:00:00.gz from testfile-2007-09-24-03:00:00.gz is not a valid DFS filename. [EMAIL PROTECTED] echo "TEST" > testfile [EMAIL PROTECTED] hadoop dfs -put ./testfile "testfile-2007-09-24-03%3a00%3a00.gz" put: Pathname /user/lohit/testfile-2007-09-24-03:00:00.gz from testfile-2007-09-24-03:00:00.gz is not a valid DFS filename. > filenames with ':' colon throws java.lang.IllegalArgumentException > -- > > Key: HADOOP-2066 > URL: https://issues.apache.org/jira/browse/HADOOP-2066 > Project: Hadoop > Issue Type: Bug > Components: dfs >Reporter: lohit vijayarenu > Attachments: HADOOP-2066.patch > > > File names containing colon ":" throws java.lang.IllegalArgumentException > while LINUX file system supports it. 
> $ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest > Exception in thread "main" java.lang.IllegalArgumentException: > java.net.URISyntaxException: Relative path in absolute > URI: testfile-2007-09-24-03:00:00.gz > at org.apache.hadoop.fs.Path.initialize(Path.java:140) > at org.apache.hadoop.fs.Path.(Path.java:126) > at org.apache.hadoop.fs.Path.(Path.java:50) > at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757) > at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > testfile-2007-09-24-03:00:00.gz > at java.net.URI.checkPath(URI.java:1787) > at java.net.URI.(URI.java:735) > at org.apache.hadoop.fs.Path.initialize(Path.java:137) > ... 10 more > Path(String pathString) when given a filename which contains ':' treats it as > URI and selects anything before ':' as > scheme, which in this case is clearly not a valid scheme. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Attachment: HADOOP-2071-1.patch Attached is a patch which eliminates mark/reset. In one place, seek() was called even after reset(), which made it redundant. Could anyone please review this? Thanks > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu > Attachments: HADOOP-2071-1.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(StreamXmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(StreamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader.java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.java:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2071: - Assignee: lohit vijayarenu > StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in > hadoop 0.14 > - > > Key: HADOOP-2071 > URL: https://issues.apache.org/jira/browse/HADOOP-2071 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.3 >Reporter: lohit vijayarenu >Assignee: lohit vijayarenu > Attachments: HADOOP-2071-1.patch > > > In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs > throw > java.io.IOException: Mark/reset exception in hadoop 0.14 > This looks to be related to > (https://issues.apache.org/jira/browse/HADOOP-2067). > > Caused by: java.io.IOException: Mark/reset not supported > at > org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353) > at java.io.FilterInputStream.reset(FilterInputStream.java:200) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamX > mlRecordReader.java:289) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(Stream > XmlRecordReader.java:118) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(Str > eamXmlRecordReader.java:111) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader > .java:73) > at > org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.jav > a:63) > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2071) StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14
StreamXmlRecordReader throws java.io.IOException: Mark/reset exception in hadoop 0.14 - Key: HADOOP-2071 URL: https://issues.apache.org/jira/browse/HADOOP-2071 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.14.3 Reporter: lohit vijayarenu In hadoop 0.14, using -inputreader StreamXmlRecordReader for streaming jobs throws java.io.IOException: Mark/reset exception. This looks to be related to HADOOP-2067 (https://issues.apache.org/jira/browse/HADOOP-2067).
Caused by: java.io.IOException: Mark/reset not supported
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.reset(DFSClient.java:1353)
        at java.io.FilterInputStream.reset(FilterInputStream.java:200)
        at org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:289)
        at org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchBegin(StreamXmlRecordReader.java:118)
        at org.apache.hadoop.streaming.StreamXmlRecordReader.seekNextRecordBoundary(StreamXmlRecordReader.java:111)
        at org.apache.hadoop.streaming.StreamXmlRecordReader.init(StreamXmlRecordReader.java:73)
        at org.apache.hadoop.streaming.StreamXmlRecordReader.(StreamXmlRecordReader.java:63)
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2066) filenames with ':' colon throws java.lang.IllegalArgumentException
[ https://issues.apache.org/jira/browse/HADOOP-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2066: - Description: File names containing colon ":" throws java.lang.IllegalArgumentException while LINUX file system supports it. $ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz at org.apache.hadoop.fs.Path.initialize(Path.java:140) at org.apache.hadoop.fs.Path.(Path.java:126) at org.apache.hadoop.fs.Path.(Path.java:50) at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757) at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) Caused by: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz at java.net.URI.checkPath(URI.java:1787) at java.net.URI.(URI.java:735) at org.apache.hadoop.fs.Path.initialize(Path.java:137) ... 10 more Path(String pathString) when given a filename which contains ':' treats it as URI and selects anything before ':' as scheme, which in this case is clearly not a valid scheme. was: File names containing colon ":" throws java.lang.IllegalArgumentException while LINUX file system supports it. 
[EMAIL PROTECTED] ~]$ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz at org.apache.hadoop.fs.Path.initialize(Path.java:140) at org.apache.hadoop.fs.Path.(Path.java:126) at org.apache.hadoop.fs.Path.(Path.java:50) at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757) at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) Caused by: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz at java.net.URI.checkPath(URI.java:1787) at java.net.URI.(URI.java:735) at org.apache.hadoop.fs.Path.initialize(Path.java:137) ... 10 more [EMAIL PROTECTED] ~]$ Path(String pathString) when given a filename which contains ':' treats it as URI and selects anything before ':' as scheme, which in this case is clearly not a valid scheme. > filenames with ':' colon throws java.lang.IllegalArgumentException > -- > > Key: HADOOP-2066 > URL: https://issues.apache.org/jira/browse/HADOOP-2066 > Project: Hadoop > Issue Type: Bug > Components: dfs >Reporter: lohit vijayarenu > Attachments: HADOOP-2066.patch > > > File names containing colon ":" throws java.lang.IllegalArgumentException > while LINUX file system supports it. 
> $ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest > Exception in thread "main" java.lang.IllegalArgumentException: > java.net.URISyntaxException: Relative path in absolute > URI: testfile-2007-09-24-03:00:00.gz > at org.apache.hadoop.fs.Path.initialize(Path.java:140) > at org.apache.hadoop.fs.Path.(Path.java:126) > at org.apache.hadoop.fs.Path.(Path.java:50) > at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776) > at > org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757) > at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > testfile-2007-09-24-03:00:00.gz > at java.net.URI.checkPath(URI.java:1787) > at java.net.URI.(URI.java:735) > at org.apache.hadoop.fs.Path.initialize(Path.java:137) > ... 10 more > Path(String pathString) when given a filename which contains ':' treats it as > URI and selects anything before ':' as > scheme, which in this case is clearly not a valid scheme. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2067) multiple close() failing in Hadoop 0.14
[ https://issues.apache.org/jira/browse/HADOOP-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-2067: - Attachment: stack_trace_13_and_14.txt > multiple close() failing in Hadoop 0.14 > --- > > Key: HADOOP-2067 > URL: https://issues.apache.org/jira/browse/HADOOP-2067 > Project: Hadoop > Issue Type: Bug > Components: dfs >Reporter: lohit vijayarenu > Attachments: stack_trace_13_and_14.txt > > > It looks like multiple close() calls, while reading files from DFS is failing > in hadoop 0.14. This was somehow not caught in hadoop 0.13. > The use case was to open a file on DFS like shown below > > FSDataInputStream > fSDataInputStream = > fileSystem.open(new Path(propertyFileName)); > Properties subProperties = > new Properties(); > subProperties. > loadFromXML(fSDataInputStream); > fSDataInputStream. > close(); > > This failed with an IOException > > EXCEPTIN RAISED, which is java.io.IOException: Stream closed > java.io.IOException: Stream closed > > The stack trace shows its being closed twice. While this used to work in > hadoop 0.13 which used to hide this. > Attached with this JIRA is a text file which has stack trace for both hadoop > 0.13 and hadoop 0.14. > How should this be handled from a users point of view? > Thanks -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-2067) multiple close() failing in Hadoop 0.14
multiple close() failing in Hadoop 0.14 --- Key: HADOOP-2067 URL: https://issues.apache.org/jira/browse/HADOOP-2067 Project: Hadoop Issue Type: Bug Components: dfs Reporter: lohit vijayarenu It looks like multiple close() calls while reading files from DFS fail in hadoop 0.14. This was somehow not caught in hadoop 0.13. The use case was to open a file on DFS as shown below:
FSDataInputStream fSDataInputStream = fileSystem.open(new Path(propertyFileName));
Properties subProperties = new Properties();
subProperties.loadFromXML(fSDataInputStream);
fSDataInputStream.close();
This failed with an IOException:
EXCEPTIN RAISED, which is java.io.IOException: Stream closed
java.io.IOException: Stream closed
The stack trace shows the stream being closed twice; this used to work in hadoop 0.13, which hid the problem. Attached with this JIRA is a text file which has the stack trace for both hadoop 0.13 and hadoop 0.14. How should this be handled from a user's point of view? Thanks -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
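Note that Properties.loadFromXML closes the stream it is given, so the explicit close() in the snippet above is the second one. One user-side workaround is to guard against the repeated close() before it reaches the underlying stream. A hedged sketch (the wrapper class and its name are ours, not a Hadoop API):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Wrapper that makes close() idempotent: only the first call reaches the
// underlying stream, so a library that closes the stream for you cannot
// trigger a second, failing close().
public class IdempotentCloseStream extends FilterInputStream {
    private boolean closed = false;

    public IdempotentCloseStream(InputStream in) { super(in); }

    @Override
    public synchronized void close() throws IOException {
        if (closed) return;   // swallow repeat close() calls
        closed = true;
        super.close();
    }

    public static void main(String[] args) throws IOException {
        // Underlying stream that mimics the 0.14 behaviour: second close() throws.
        InputStream strict = new ByteArrayInputStream(new byte[0]) {
            private boolean closedOnce = false;
            @Override public void close() throws IOException {
                if (closedOnce) throw new IOException("Stream closed");
                closedOnce = true;
            }
        };
        InputStream safe = new IdempotentCloseStream(strict);
        safe.close();
        safe.close();   // no exception: the wrapper absorbs the repeat
        System.out.println("double close OK");
    }
}
```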
[jira] Created: (HADOOP-2066) filenames with ':' colon throws java.lang.IllegalArgumentException
filenames with ':' colon throws java.lang.IllegalArgumentException -- Key: HADOOP-2066 URL: https://issues.apache.org/jira/browse/HADOOP-2066 Project: Hadoop Issue Type: Bug Components: dfs Reporter: lohit vijayarenu File names containing a colon ":" throw java.lang.IllegalArgumentException, while the LINUX file system supports them.
[EMAIL PROTECTED] ~]$ hadoop dfs -put ./testfile-2007-09-24-03:00:00.gz filenametest
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz
        at org.apache.hadoop.fs.Path.initialize(Path.java:140)
        at org.apache.hadoop.fs.Path.(Path.java:126)
        at org.apache.hadoop.fs.Path.(Path.java:50)
        at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:273)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757)
        at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: testfile-2007-09-24-03:00:00.gz
        at java.net.URI.checkPath(URI.java:1787)
        at java.net.URI.(URI.java:735)
        at org.apache.hadoop.fs.Path.initialize(Path.java:137)
        ... 10 more
[EMAIL PROTECTED] ~]$
Path(String pathString), when given a filename which contains ':', treats it as a URI and selects anything before ':' as the scheme, which in this case is clearly not a valid scheme. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
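The root cause can be reproduced with java.net.URI alone: in a string like testfile-2007-09-24-03:00:00.gz, everything before the first ':' is syntactically a valid URI scheme (letters, digits, '-'), so the parser consumes it as one. A minimal sketch (our own demo class, not Hadoop's Path code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ColonPathDemo {
    // Parse a bare filename the way single-argument URI parsing sees it.
    static String schemeOf(String name) throws URISyntaxException {
        return new URI(name).getScheme();
    }

    public static void main(String[] args) throws URISyntaxException {
        // The text before ':' is consumed as the URI "scheme".
        System.out.println(schemeOf("testfile-2007-09-24-03:00:00.gz"));
        // A name with no colon has no scheme at all (getScheme() is null).
        System.out.println(schemeOf("plainfile.gz"));
    }
}
```

This is why percent-encoding the colon (as %3a in the comments above) sidesteps the problem: the encoded name no longer contains a literal ':' for the parser to treat as a scheme delimiter.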
[jira] Created: (HADOOP-2053) OutOfMemoryError : Java heap space errors in hadoop 0.14
OutOfMemoryError : Java heap space errors in hadoop 0.14 Key: HADOOP-2053 URL: https://issues.apache.org/jira/browse/HADOOP-2053 Project: Hadoop Issue Type: Bug Components: mapred Affects Versions: 0.14.1 Reporter: lohit vijayarenu Fix For: 0.15.0 In recent hadoop 0.14 we are seeing a few jobs where map tasks fail with java.lang.OutOfMemoryError: Java heap space. These are the same jobs which used to work fine with 0.13.
task_200710112103_0001_m_15_1: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.Text.write(Text.java:243)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Status: Patch Available (was: Open) > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: Koji Noguchi >Assignee: lohit vijayarenu >Priority: Blocker > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771-3.patch, > h-1771-3.patch, h-1771-3.patch2, h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Attachment: h-1771-3.patch2 Sorry about that. Fixed and updated the patch. > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: Koji Noguchi >Assignee: Michel Tourn >Priority: Blocker > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771-3.patch, > h-1771-3.patch, h-1771-3.patch2, h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Affects Version/s: (was: 0.13.1) 0.14.1 Status: Patch Available (was: Open) making patch available. Thanks > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Blocker > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771-3.patch, > h-1771-3.patch, h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Attachment: h-1771-3.patch With input from Owen, the patch now closes the streams in the main body of run() and logs the IOException as well. Thanks > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.13.1 >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Blocker > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771-3.patch, > h-1771-3.patch, h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Attachment: h-1771-3.patch Updated with alignment change. Thanks! > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.13.1 >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Blocker > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771-3.patch, > h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1771) streaming hang when IOException in MROutputThread. (NPE)
[ https://issues.apache.org/jira/browse/HADOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1771: - Attachment: h-1771-3.patch Thanks to Koji: while debugging a similar problem, we saw a case where MROutputThread was done but the map task was still hung. Upon killing the streaming map executable, we saw there had been a problem with the thread, so it had terminated; the streaming map executable, however, was still trying to write and hung. The cause could be that clientIn_ and clientErr_ are not closed when their thread is done. Attached is an updated patch which makes sure the streams are closed when their threads exit. Could anyone please review/comment. Thanks. > streaming hang when IOException in MROutputThread. (NPE) > > > Key: HADOOP-1771 > URL: https://issues.apache.org/jira/browse/HADOOP-1771 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.13.1 >Reporter: Koji Noguchi >Assignee: Koji Noguchi >Priority: Minor > Fix For: 0.15.0 > > Attachments: h-1771-2.patch, h-1771-3.patch, h-1771.patch > > > One streaming task hang and had stderr userlog as follows. > {code} > Exception in thread "Thread-5" java.lang.NullPointerException > at java.lang.Throwable.printStackTrace(Throwable.java:460) > at > org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:352) > {code} > In PipeMapRed.java > {code} > 351 } catch (IOException io) { > 352 io.printStackTrace(log_); > 353 outerrThreadsThrowable = io; > {code} > I guess log_ is Null... Should call logStackTrace. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
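The hang described in that comment fits a classic pipe deadlock: if the draining thread exits without closing its end of the pipe, the child executable eventually fills the pipe buffer and blocks in write() forever. A minimal sketch of the fix, assuming the thread structure implied by the report (the field name clientIn_ comes from the comment; the rest is illustrative, not the actual MROutputThread):

```java
import java.io.IOException;
import java.io.InputStream;

class OutputDrainer extends Thread {
    private final InputStream clientIn_;  // child's stdout (name from the report)

    OutputDrainer(InputStream in) { this.clientIn_ = in; }

    @Override
    public void run() {
        try {
            byte[] buf = new byte[4096];
            while (clientIn_.read(buf) != -1) {
                // ... forward the child's output to the collector ...
            }
        } catch (IOException io) {
            // record the failure for the main thread instead of dying silently
        } finally {
            try {
                clientIn_.close();  // always close, so the child sees EOF/EPIPE
            } catch (IOException ignored) { }
        }
    }
}
```

The essential part is the finally block: with it, the child's next write fails fast instead of blocking on a full pipe after the drainer is gone.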
[jira] Updated: (HADOOP-1626) DFSAdmin. Help messages are missing for -finalizeUpgrade and -metasave.
[ https://issues.apache.org/jira/browse/HADOOP-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1626: - Attachment: HADOOP-1626.2.patch Updated patch with a few more changes suggested by Nigel. Thanks! > DFSAdmin. Help messages are missing for -finalizeUpgrade and -metasave. > --- > > Key: HADOOP-1626 > URL: https://issues.apache.org/jira/browse/HADOOP-1626 > Project: Hadoop > Issue Type: Improvement > Components: dfs >Affects Versions: 0.13.0 >Reporter: Konstantin Shvachko >Priority: Blocker > Fix For: 0.15.0 > > Attachments: HADOOP-1626.2.patch, HADOOP-1626.patch > > > DFSAdmin.printHelp() does not print help for the two commands above. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1626) DFSAdmin. Help messages are missing for -finalizeUpgrade and -metasave.
[ https://issues.apache.org/jira/browse/HADOOP-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1626: - Attachment: HADOOP-1626.patch Attached is a patch which includes help messages for -finalizeUpgrade and -metasave. Below are the help messages. [EMAIL PROTECTED] hadoop-trunk]$ hadoop dfsadmin -help finalizeUpgrade -finalizeUpgrade: Finalize upgrade of DFS. Datanodes delete their old version working directories, followed by namenode removing its own and checkpoints in it. This completes the upgrade process. [EMAIL PROTECTED] hadoop-trunk]$ hadoop dfsadmin -help metasave -metasave <filename>: Save namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> contains one line for each of the following: 1. Datanodes heart beating with Namenode 2. Blocks waiting to be replicated 3. Blocks currently being replicated 4. Blocks waiting to be deleted [EMAIL PROTECTED] hadoop-trunk]$ > DFSAdmin. Help messages are missing for -finalizeUpgrade and -metasave. > --- > > Key: HADOOP-1626 > URL: https://issues.apache.org/jira/browse/HADOOP-1626 > Project: Hadoop > Issue Type: Improvement > Components: dfs >Affects Versions: 0.13.0 >Reporter: Konstantin Shvachko >Priority: Blocker > Fix For: 0.15.0 > > Attachments: HADOOP-1626.patch > > > DFSAdmin.printHelp() does not print help for the two commands above. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1977) hadoop job -kill , -status causes NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531709 ] lohit vijayarenu commented on HADOOP-1977: -- Hi Enis, I tried your patch on 0.14.1 and both -status and -kill work. Thanks! > hadoop job -kill , -status causes NullPointerException > -- > > Key: HADOOP-1977 > URL: https://issues.apache.org/jira/browse/HADOOP-1977 > Project: Hadoop > Issue Type: Bug > Components: mapred >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Assignee: Enis Soztutar >Priority: Blocker > Fix For: 0.14.2 > > Attachments: NPEinJobClient_v1.hadoop-0.14.patch, > NPEinJobClient_v1.patch > > > hadoop job -kill/-status seems to cause NullPointerException > As an example, I started a streaming job and tried to kill it. This raises > NullPointerException > [EMAIL PROTECTED] mapred]$ bin/hadoop job -Dmapred.job.tracker=kry1443:56225 > -kill job_200710011856_0001 > 07/10/01 18:57:07 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at org.apache.hadoop.mapred.JobClient.run(JobClient.java:791) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ > So does 'hadoop job -status' > [EMAIL PROTECTED] mapred]$hadoop job -Dmapred.job.tracker=kry1443:56225 > -status job_200710011856_0001 > 07/10/01 18:57:21 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > 
org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at org.apache.hadoop.mapred.JobClient.run(JobClient.java:782) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1977) hadoop job -kill , -status causes NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1977: - Component/s: mapred Summary: hadoop job -kill , -status causes NullPointerException (was: hadoop job -kill , -status ) > hadoop job -kill , -status causes NullPointerException > -- > > Key: HADOOP-1977 > URL: https://issues.apache.org/jira/browse/HADOOP-1977 > Project: Hadoop > Issue Type: Bug > Components: mapred >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu > Fix For: 0.14.2 > > > hadoop job -kill/-status seems to cause NullPointerException > As an example, I started a streaming job and tried to kill it. This raises > NullPointerException > [EMAIL PROTECTED] mapred]$ bin/hadoop job -Dmapred.job.tracker=kry1443:56225 > -kill job_200710011856_0001 > 07/10/01 18:57:07 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at org.apache.hadoop.mapred.JobClient.run(JobClient.java:791) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ > So does 'hadoop job -status' > [EMAIL PROTECTED] mapred]$hadoop job -Dmapred.job.tracker=kry1443:56225 > -status job_200710011856_0001 > 07/10/01 18:57:21 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at 
org.apache.hadoop.mapred.JobClient.run(JobClient.java:782) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1977) hadoop job -kill , -status causes NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1977: - Priority: Blocker (was: Major) > hadoop job -kill , -status causes NullPointerException > -- > > Key: HADOOP-1977 > URL: https://issues.apache.org/jira/browse/HADOOP-1977 > Project: Hadoop > Issue Type: Bug > Components: mapred >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Priority: Blocker > Fix For: 0.14.2 > > > hadoop job -kill/-status seems to cause NullPointerException > As an example, I started a streaming job and tried to kill it. This raises > NullPointerException > [EMAIL PROTECTED] mapred]$ bin/hadoop job -Dmapred.job.tracker=kry1443:56225 > -kill job_200710011856_0001 > 07/10/01 18:57:07 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at org.apache.hadoop.mapred.JobClient.run(JobClient.java:791) > at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ > So does 'hadoop job -status' > [EMAIL PROTECTED] mapred]$hadoop job -Dmapred.job.tracker=kry1443:56225 > -status job_200710011856_0001 > 07/10/01 18:57:21 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) > at > org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) > at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) > at org.apache.hadoop.mapred.JobClient.run(JobClient.java:782) > at 
org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) > at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) > [EMAIL PROTECTED] mapred]$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-1977) hadoop job -kill , -status
hadoop job -kill , -status --- Key: HADOOP-1977 URL: https://issues.apache.org/jira/browse/HADOOP-1977 Project: Hadoop Issue Type: Bug Affects Versions: 0.14.1 Reporter: lohit vijayarenu Fix For: 0.14.2 hadoop job -kill/-status seems to cause NullPointerException As an example, I started a streaming job and tried to kill it. This raises NullPointerException [EMAIL PROTECTED] mapred]$ bin/hadoop job -Dmapred.job.tracker=kry1443:56225 -kill job_200710011856_0001 07/10/01 18:57:07 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) at org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) at org.apache.hadoop.mapred.JobClient.run(JobClient.java:791) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) [EMAIL PROTECTED] mapred]$ So does 'hadoop job -status' [EMAIL PROTECTED] mapred]$hadoop job -Dmapred.job.tracker=kry1443:56225 -status job_200710011856_0001 07/10/01 18:57:21 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.mapred.LocalJobRunner$Job.access$600(LocalJobRunner.java:51) at org.apache.hadoop.mapred.LocalJobRunner.getJobStatus(LocalJobRunner.java:296) at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:512) at org.apache.hadoop.mapred.JobClient.run(JobClient.java:782) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.mapred.JobClient.main(JobClient.java:827) [EMAIL PROTECTED] mapred]$ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
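The trace points at LocalJobRunner$Job.access$600 via getJobStatus, which suggests the local runner is handed a job id it never saw (a cluster job id) and dereferences the missing entry. A minimal sketch of the defensive lookup (hypothetical classes for illustration, not Hadoop's actual LocalJobRunner):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a job registry whose status lookup returns null for unknown
// ids instead of dereferencing a missing entry, which is what the NPE
// at LocalJobRunner.java:296 suggests was happening.
class JobRegistry {
    static final class JobStatus {
        final String id;
        JobStatus(String id) { this.id = id; }
    }

    private final Map<String, JobStatus> jobs = new HashMap<>();

    void submit(String id) { jobs.put(id, new JobStatus(id)); }

    // Returns null for ids this runner never saw, e.g. a cluster job id
    // handed to a local runner; callers must null-check before use.
    JobStatus getJobStatus(String id) {
        return jobs.get(id);
    }
}
```

With this contract, JobClient-style callers can print "Could not find job" rather than crash with a NullPointerException.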
[jira] Created: (HADOOP-1967) hadoop dfs -ls, -get, -mv command's source/destination URI are inconsistent
hadoop dfs -ls, -get, -mv command's source/destination URI are inconsistent --- Key: HADOOP-1967 URL: https://issues.apache.org/jira/browse/HADOOP-1967 Project: Hadoop Issue Type: Bug Components: dfs Affects Versions: 0.14.1 Reporter: lohit vijayarenu While specifying source/destination paths for the hadoop dfs -ls, -get, -mv, -cp commands, we have some inconsistency related to the 'hdfs://' scheme. In particular, some of the commands accept both formats [1] hdfs:///user/lohit/testfile [2] hdfs://myhost:8020/user/lohit/testfile while other commands accept only paths which have an authority (host:port) [2] hdfs://myhost:8020/user/lohit/testfile Below are examples. hadoop dfs -ls (works for both formats) {quote} [EMAIL PROTECTED] ~]$ hadoop dfs -ls hdfs://kry-nn1:8020/user/lohit/ranges Found 1 items /user/lohit/ranges 24 1970-01-01 00:00 [EMAIL PROTECTED] ~]$ hadoop dfs -ls hdfs:///user/lohit/ranges Found 1 items {quote} hadoop dfs -get (works for only format [2]) {quote} [EMAIL PROTECTED] ~]$ hadoop dfs -get hdfs:///user/lohit/ranges . Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs:/user/lohit/ranges, expected: hdfs://kry-nn1:8020 at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:204) at org.apache.hadoop.dfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:108) at org.apache.hadoop.dfs.DistributedFileSystem.getPath(DistributedFileSystem.java:104) at org.apache.hadoop.dfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:319) at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:423) at org.apache.hadoop.fs.FsShell.copyToLocal(FsShell.java:177) at org.apache.hadoop.fs.FsShell.copyToLocal(FsShell.java:155) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1233) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) [EMAIL PROTECTED] ~]$ hadoop dfs -get hdfs://kry-nn1:8020/user/lohit/ranges . 
[EMAIL PROTECTED] ~]$ ls ./ranges ./ranges [EMAIL PROTECTED] ~]$ {quote} hadoop dfs -mv / -cp command. source path accepts both format [1] and [2], while destination accepts only [2]. {quote} [EMAIL PROTECTED] ~]$ hadoop dfs -cp hdfs://kry-nn1:8020/user/lohit/ranges.test2 hdfs:///user/lohit/ranges.test Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs:/user/lohit/ranges.test, expected: hdfs://kry-nn1:8020 at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:204) at org.apache.hadoop.dfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:108) at org.apache.hadoop.dfs.DistributedFileSystem.getPath(DistributedFileSystem.java:104) at org.apache.hadoop.dfs.DistributedFileSystem.exists(DistributedFileSystem.java:162) at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:269) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:117) at org.apache.hadoop.fs.FsShell.copy(FsShell.java:691) at org.apache.hadoop.fs.FsShell.copy(FsShell.java:727) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1260) at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342) [EMAIL PROTECTED] ~]$ hadoop dfs -cp hdfs:///user/lohit/ranges.test2 hdfs://kry-nn1:8020/user/lohit/ranges.test [EMAIL PROTECTED] ~]$ {quote} We should have a consistent URI naming convention across all commands. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
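The "Wrong FS" failures above occur only for authority-less URIs, which suggests the path check compares authorities literally instead of letting an absent authority inherit the filesystem's default. A sketch of the more lenient qualification (illustrative only, not the actual FileSystem.checkPath):

```java
import java.net.URI;

class PathCheck {
    // Accept hdfs:///p as shorthand for hdfs://defaultHost:port/p:
    // an absent authority inherits the filesystem's default authority.
    static URI qualify(URI path, URI fsDefault) {
        if (path.getScheme() == null) {
            return fsDefault.resolve(path);  // relative path: use the default FS
        }
        if (!path.getScheme().equals(fsDefault.getScheme())) {
            throw new IllegalArgumentException("Wrong FS: " + path);
        }
        String auth = path.getAuthority();
        if (auth == null || auth.isEmpty()) {
            auth = fsDefault.getAuthority();  // fill in the default host:port
        } else if (!auth.equals(fsDefault.getAuthority())) {
            throw new IllegalArgumentException("Wrong FS: " + path);
        }
        return URI.create(path.getScheme() + "://" + auth + path.getPath());
    }
}
```

Under this rule both hdfs:///user/lohit/ranges and hdfs://kry-nn1:8020/user/lohit/ranges qualify to the same path, while a genuinely foreign authority still fails.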
[jira] Updated: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
[ https://issues.apache.org/jira/browse/HADOOP-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1952: - Attachment: CatchInvalidInputFormat.patch Attached is a simple patch to fix this. The patch also adds a check against class.getSimpleName() while resolving the inputformat class, so users do not have to specify the fully qualified name for a standard class; they can just provide the simple class name. Thanks > Streaming does not handle invalid -inputformat (typo by users for example) > --- > > Key: HADOOP-1952 > URL: https://issues.apache.org/jira/browse/HADOOP-1952 > Project: Hadoop > Issue Type: Bug > Components: contrib/streaming >Affects Versions: 0.14.1 >Reporter: lohit vijayarenu >Priority: Minor > Attachments: CatchInvalidInputFormat.patch > > > Hadoop Streaming does not handle invalid inputformat class. For example > -inputformat INVALID class would not be thrown as an error. Instead it > defaults to StreamInputFormat. If an invalid inputformat is specified, it is > good to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HADOOP-1952) Streaming does not handle invalid -inputformat (typo by users for example)
Streaming does not handle invalid -inputformat (typo by users for example) --- Key: HADOOP-1952 URL: https://issues.apache.org/jira/browse/HADOOP-1952 Project: Hadoop Issue Type: Bug Components: contrib/streaming Affects Versions: 0.14.1 Reporter: lohit vijayarenu Priority: Minor Hadoop Streaming does not handle an invalid inputformat class. For example, -inputformat INVALID is not reported as an error; instead it defaults to StreamInputFormat. If an invalid inputformat is specified, it is better to fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
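One way to get the fail-fast behavior this report asks for is to resolve the class name explicitly and throw when nothing matches, rather than silently defaulting to StreamInputFormat. A sketch under that assumption (the fallback package here is illustrative, not the patch's exact lookup order):

```java
// Sketch of fail-fast -inputformat resolution: try the name as given,
// then as a simple name in an assumed default package, and raise an
// error instead of silently falling back to a default input format.
class InputFormatResolver {
    private static final String DEFAULT_PKG = "org.apache.hadoop.mapred.";  // assumption

    static Class<?> resolve(String name) {
        try {
            return Class.forName(name);  // fully qualified name, e.g. a.b.MyFormat
        } catch (ClassNotFoundException e) {
            try {
                return Class.forName(DEFAULT_PKG + name);  // simple name shortcut
            } catch (ClassNotFoundException e2) {
                // a typo like "-inputformat INVALID" now fails loudly
                throw new IllegalArgumentException("Unknown -inputformat: " + name);
            }
        }
    }
}
```

The simple-name branch also gives the convenience the patch comment describes: standard formats can be named without their full package path.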
[jira] Updated: (HADOOP-1210) Log counters in job history
[ https://issues.apache.org/jira/browse/HADOOP-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1210: - Affects Version/s: 0.14.0 > Log counters in job history > --- > > Key: HADOOP-1210 > URL: https://issues.apache.org/jira/browse/HADOOP-1210 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Affects Versions: 0.14.0 >Reporter: Albert Chern >Priority: Minor > Attachments: counters_output, h-1210-job-and-task-status.patch, > h-1210-job-and-task-status.patch, h-1210-jobstatus.patch > > > It would be useful if the value of the global counters were logged to the job > history, perhaps even individually for each task after completion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HADOOP-1781) Need more complete API of JobClient class
[ https://issues.apache.org/jira/browse/HADOOP-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522652 ] lohit edited comment on HADOOP-1781 at 8/24/07 2:30 PM: --- (HADOOP-1210) Similar issue, where for now we are trying to dump details of tasks to stdout via hadoop job -status [-taskdetails] option was (Author: lohit): Similar issue, where for now we are trying to dump details of tasks to stdout via hadoop job -status [-taskdetails] option > Need more complete API of JobClient class > - > > Key: HADOOP-1781 > URL: https://issues.apache.org/jira/browse/HADOOP-1781 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Reporter: Runping Qi > > We need a programmatic way to find out the information about a map/reduce > cluster and the jobs on the cluster. > The current API is not complete. > In particular, the following API functions are needed: > 1. jobs() currently, there is an API function JobsToComplete, which returns > running/waiting jobs only. jobs() should return the complete list. > 2. TaskReport[] getMap/ReduceTaskReports(String jobid) > 3. getStartTime() > 4. getJobStatus(String jobid); > 5. getJobProfile(String jobid); -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1210) Log counters in job history
[ https://issues.apache.org/jira/browse/HADOOP-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1210: - Attachment: h-1210-job-and-task-status.patch Thanks Doug. Milind suggested the same, and here is an updated patch. The option has been renamed from -counters to -taskdetails; if provided, it dumps start/end/wall time along with counters, this time in a format which can be parsed easily. > Log counters in job history > --- > > Key: HADOOP-1210 > URL: https://issues.apache.org/jira/browse/HADOOP-1210 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Reporter: Albert Chern >Priority: Minor > Attachments: counters_output, h-1210-job-and-task-status.patch, > h-1210-job-and-task-status.patch, h-1210-jobstatus.patch > > > It would be useful if the value of the global counters were logged to the job > history, perhaps even individually for each task after completion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1210) Log counters in job history
[ https://issues.apache.org/jira/browse/HADOOP-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1210: - Attachment: h-1210-job-and-task-status.patch Updated patch included. > Log counters in job history > --- > > Key: HADOOP-1210 > URL: https://issues.apache.org/jira/browse/HADOOP-1210 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Reporter: Albert Chern >Priority: Minor > Attachments: counters_output, h-1210-job-and-task-status.patch, > h-1210-jobstatus.patch > > > It would be useful if the value of the global counters were logged to the job > history, perhaps even individually for each task after completion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1210) Log counters in job history
[ https://issues.apache.org/jira/browse/HADOOP-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit vijayarenu updated HADOOP-1210: - Attachment: counters_output Thanks to Owen and Milind, made a few more changes and added a new option: hadoop job -status [-counters]. If the -counters option is specified, the patch dumps the global counters and the counters of each map/reduce task to STDOUT. I am attaching a sample output; I have tried to organize it so that it can be parsed later by another program/script. > Log counters in job history > --- > > Key: HADOOP-1210 > URL: https://issues.apache.org/jira/browse/HADOOP-1210 > Project: Hadoop > Issue Type: Improvement > Components: mapred >Reporter: Albert Chern >Priority: Minor > Attachments: counters_output, h-1210-job-and-task-status.patch, > h-1210-jobstatus.patch > > > It would be useful if the value of the global counters were logged to the job > history, perhaps even individually for each task after completion. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
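A counter dump that "could be parsed later by another program/script" is easiest with one record per line and a fixed field separator. A sketch of such a formatter (the tab-separated field layout is an assumption for illustration, not the format of the attached counters_output):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CounterDump {
    // Emit "taskId<TAB>group<TAB>counter<TAB>value", one counter per line,
    // so a script can split on tabs without worrying about spaces in names.
    static String format(String taskId, String group, Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            sb.append(taskId).append('\t')
              .append(group).append('\t')
              .append(e.getKey()).append('\t')
              .append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

A per-task record stream like this is trivially consumed by awk or a small script, which is the stated goal of the -counters / -taskdetails output.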