[jira] Created: (HADOOP-2669) DFS client lost lease during writing into DFS files

2008-01-20 Thread Runping Qi (JIRA)
Reporter: Runping Qi I have a program that reads a block compressed sequence file, does some processing on the records and writes the processed records into another block compressed sequence file. During execution of the program, I got the following exception

[jira] Commented: (HADOOP-2663) Buffer class' toString method should accept a code name for "true" utf-8 codeName

2008-01-18 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560637#action_12560637 ] Runping Qi commented on HADOOP-2663: On the flip side, the Buffer should provi

[jira] Created: (HADOOP-2663) Buffer class' toString method should accept a code name for "true" utf-8 codeName

2008-01-18 Thread Runping Qi (JIRA)
Project: Hadoop Issue Type: Improvement Components: record Reporter: Runping Qi Assignee: Milind Bhandarkar Currently, if one calls toString("UTF-8"), a String object is created using Java's conversion code. That does not work properl

[jira] Commented: (HADOOP-2608) Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process

2008-01-16 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559815#action_12559815 ] Runping Qi commented on HADOOP-2608: Forgot one point: when scanning sequence f

[jira] Commented: (HADOOP-2608) Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process

2008-01-16 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559812#action_12559812 ] Runping Qi commented on HADOOP-2608: I profiled the program of reading sequ

[jira] Created: (HADOOP-2608) Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process

2008-01-14 Thread Runping Qi (JIRA)
-2608 Project: Hadoop Issue Type: Improvement Components: io Reporter: Runping Qi I did some tests on the throughput of scanning block-compressed sequence files. The sustained throughput was bounded at 5MB/sec per process, with the cpu of each process

[jira] Commented: (HADOOP-2491) generalize the TT / JT servers to handle more generic tasks

2008-01-14 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558749#action_12558749 ] Runping Qi commented on HADOOP-2491: A great analysis. +2 Especially like

[jira] Commented: (HADOOP-2581) Counters and other useful stats should be logged into Job History log

2008-01-11 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558147#action_12558147 ] Runping Qi commented on HADOOP-2581: Cool. That will address the first 2 i

[jira] Created: (HADOOP-2581) Counters and other useful stats should be logged into Job History log

2008-01-11 Thread Runping Qi (JIRA)
: Improvement Components: mapred Reporter: Runping Qi The following stats are useful and available to the JT but are not logged in the job history log: 1. The counters of each job 2. The counters of each mapper/reducer attempt 3. The info about the input splits (filename, split size, on

[jira] Commented: (HADOOP-2178) Job history on HDFS

2008-01-11 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558029#action_12558029 ] Runping Qi commented on HADOOP-2178: Even with hod JT, we still need to address

[jira] Commented: (HADOOP-2570) streaming jobs fail after HADOOP-2227

2008-01-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557720#action_12557720 ] Runping Qi commented on HADOOP-2570: Lohit's suggestion should work.

[jira] Created: (HADOOP-2560) Combining multiple input blocks into one mapper

2008-01-09 Thread Runping Qi (JIRA)
Combining multiple input blocks into one mapper --- Key: HADOOP-2560 URL: https://issues.apache.org/jira/browse/HADOOP-2560 Project: Hadoop Issue Type: Bug Reporter: Runping Qi

[jira] Created: (HADOOP-2559) DFS should place one replica per rack

2008-01-09 Thread Runping Qi (JIRA)
: Runping Qi Currently, when writing out a block, dfs will place one copy on a local data node, one copy on a rack-local node and another one on a remote node. This leads to a number of undesired properties: 1. The block will be rack-local to two racks instead of three, reducing the advantage of
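
A minimal Java sketch of the rack-spread point made in the report above. The class `RackSpread`, the method `distinctRacks`, and the rack names are hypothetical illustrations and are not part of Hadoop; the sketch only counts how many distinct racks hold a replica under the placement described in the report versus a one-replica-per-rack placement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Hadoop class: counts the distinct racks
// that hold at least one replica of a single block.
class RackSpread {
    // Each list entry is the rack name of the data node holding one replica.
    static int distinctRacks(List<String> replicaRacks) {
        Set<String> racks = new HashSet<>(replicaRacks);
        return racks.size();
    }

    public static void main(String[] args) {
        // Placement described in the report: local node, rack-local node, remote node.
        List<String> current = Arrays.asList("rackA", "rackA", "rackB");
        // Placement suggested by the issue title: one replica per rack.
        List<String> proposed = Arrays.asList("rackA", "rackB", "rackC");
        System.out.println("current placement racks:  " + distinctRacks(current));
        System.out.println("proposed placement racks: " + distinctRacks(proposed));
    }
}
```

With three replicas, the first placement covers two racks and the second covers three, which is the difference the report points at.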

[jira] Commented: (HADOOP-1876) Persisting completed jobs status

2008-01-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557292#action_12557292 ] Runping Qi commented on HADOOP-1876: I am fine with the approach of this patch i

[jira] Assigned: (HADOOP-2094) DFS should not use round robin policy in determing on which volume (file system partition) to allocate for the next block

2008-01-08 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi reassigned HADOOP-2094: -- Assignee: dhruba borthakur > DFS should not use round robin policy in determing on wh

[jira] Assigned: (HADOOP-2014) Job Tracker should not clobber the data locality of tasks

2008-01-08 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi reassigned HADOOP-2014: -- Assignee: Devaraj Das > Job Tracker should not clobber the data locality of ta

[jira] Assigned: (HADOOP-2144) Data node process consumes 180% cpu

2008-01-08 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi reassigned HADOOP-2144: -- Assignee: dhruba borthakur > Data node process consumes 180%

[jira] Commented: (HADOOP-2178) Job history on HDFS

2008-01-08 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556914#action_12556914 ] Runping Qi commented on HADOOP-2178: +1. This is much clearer. Need coordinatio

[jira] Commented: (HADOOP-2178) Job history on HDFS

2008-01-07 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556630#action_12556630 ] Runping Qi commented on HADOOP-2178: BTW, Is this issue the same as H-1876 (h

[jira] Commented: (HADOOP-1876) Persisting completed jobs status

2008-01-07 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556584#action_12556584 ] Runping Qi commented on HADOOP-1876: The point is that you don't need

[jira] Commented: (HADOOP-1876) Persisting completed jobs status

2008-01-06 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556498#action_12556498 ] Runping Qi commented on HADOOP-1876: You can add counter information to JobHis

[jira] Commented: (HADOOP-2178) Job history on HDFS

2008-01-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555439#action_12555439 ] Runping Qi commented on HADOOP-2178: The output data may be deleted anytime whe

[jira] Commented: (HADOOP-2501) Implement utility-tools for working with SequenceFiles

2008-01-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555372#action_12555372 ] Runping Qi commented on HADOOP-2501: bin/hadoop seq -head assumes the key/v

[jira] Commented: (HADOOP-1298) adding user info to file

2007-12-21 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553958 ] Runping Qi commented on HADOOP-1298: Since the file sizes to read/write are very small, the NNBench should not

[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

2007-12-16 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552244 ] Runping Qi commented on HADOOP-2437: Similar problem due to round robin placement policy happens in DFS data

[jira] Commented: (HADOOP-1336) turn on speculative execution by defaul

2007-12-15 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552074 ] Runping Qi commented on HADOOP-1336: Speculative execution does not work well without the patch for hadoop

[jira] Commented: (HADOOP-2433) Streaming: org.apache.hadoop.mapred.lib.IdentityMapper should not inserted unnecessary keys

2007-12-14 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552033 ] Runping Qi commented on HADOOP-2433: There is already a proper input format class: KeyValueTextInputFormat for

[jira] Commented: (HADOOP-2429) The lowest level map-reduce APIs should be byte oriented

2007-12-14 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551923 ] Runping Qi commented on HADOOP-2429: +1 To enforce that, the API should use concrete classes (such as Buffer

[jira] Commented: (HADOOP-2369) Representative mix of jobs for large cluster throughput benchmarking

2007-12-11 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550763 ] Runping Qi commented on HADOOP-2369: +1 > Representative mix of jobs for large cluster throughput benchmark

[jira] Created: (HADOOP-2403) JobHistory log files contain data that cannot be parsed by org.apache.hadoop.mapred.JobHistory

2007-12-10 Thread Runping Qi (JIRA)
Project: Hadoop Issue Type: Bug Components: mapred Reporter: Runping Qi When some tasks fail, the job tracker writes a line to the history file with the error message. However, the error message may break the history file format, choking the

[jira] Commented: (HADOOP-2141) speculative execution start up condition based on completion time

2007-11-15 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542896 ] Runping Qi commented on HADOOP-2141: I don't think this Jira is that urgent and we have to have a quick

[jira] Commented: (HADOOP-2141) speculative execution start up condition based on completion time

2007-11-15 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542861 ] Runping Qi commented on HADOOP-2141: The above proposal sounds reasonable. Here are some points to consider

[jira] Commented: (HADOOP-1984) some reducer stuck at copy phase and progress extremely slowly

2007-11-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541828 ] Runping Qi commented on HADOOP-1984: It will be really helpful if we know the overall job progress status

[jira] Commented: (HADOOP-1984) some reducer stuck at copy phase and progress extremely slowly

2007-11-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541817 ] Runping Qi commented on HADOOP-1984: A ten-minute waiting interval seems too much. When the interval reaches a

[jira] Commented: (HADOOP-2141) speculative execution start up condition based on completion time

2007-11-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541388 ] Runping Qi commented on HADOOP-2141: +1 Speculative execution should start if the original execution is a lot

[jira] Commented: (HADOOP-2175) Blacklisted hosts may not be able to serve map outputs

2007-11-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541386 ] Runping Qi commented on HADOOP-2175: The criteria for starting a speculative execution should also include

[jira] Created: (HADOOP-2180) Each task tracker should not execute more than one speculative task

2007-11-09 Thread Runping Qi (JIRA)
Components: mapred Reporter: Runping Qi I noticed that sometimes a tasktracker started two or three speculative mapper tasks. That seems counterproductive. You want speculative execution to complete as soon as possible. Thus, it is better to spread speculative execution

[jira] Commented: (HADOOP-2175) Blacklisted hosts may not be able to serve map outputs

2007-11-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541350 ] Runping Qi commented on HADOOP-2175: then we need to try the patch for hadoop-1984. Currently, a job may stall

[jira] Commented: (HADOOP-2175) Blacklisted hosts may not be able to serve map outputs

2007-11-08 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541240 ] Runping Qi commented on HADOOP-2175: I was using hadoop 0.15.0. is the patch for hadoop-1158 in 0.15.0? if so

[jira] Created: (HADOOP-2177) Speculative execution does not work properly

2007-11-08 Thread Runping Qi (JIRA)
Reporter: Runping Qi One mapper of my job got stuck when it reached 87.7%. Speculative execution was set to true. But no speculative execution was fired for that task. The whole job was stalled.

[jira] Created: (HADOOP-2175) Blacklisted hosts may not be able to serve map outputs

2007-11-08 Thread Runping Qi (JIRA)
: mapred Reporter: Runping Qi After a node fails 4 mappers (tasks), it is added to the blacklist and will no longer accept tasks. But it will continue to serve the map outputs of any mappers that ran successfully there. However, the node may not be able to serve the map outputs either. This

[jira] Commented: (HADOOP-2164) Reducer sort failed due to wrong key class

2007-11-07 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540774 ] Runping Qi commented on HADOOP-2164: My input data is in SequenceFile. Only one map output segment file had

[jira] Commented: (HADOOP-2164) Reducer sort failed due to wrong key class

2007-11-07 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540772 ] Runping Qi commented on HADOOP-2164: All attempts failed. The job failed eventually. > Reducer sort fai

[jira] Created: (HADOOP-2164) Reducer sort failed due to wrong key class

2007-11-06 Thread Runping Qi (JIRA)
: Runping Qi One of my job's reducers failed due to the following exception: java.io.IOException: wrong key class: class org.apache.hadoop.io.LongWritable is not class org.apache.hadoop.io.Text at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile

[jira] Created: (HADOOP-2163) mapper failed due to exceptions

2007-11-06 Thread Runping Qi (JIRA)
mapper failed due to exceptions --- Key: HADOOP-2163 URL: https://issues.apache.org/jira/browse/HADOOP-2163 Project: Hadoop Issue Type: Bug Components: mapred Reporter: Runping Qi >F

[jira] Commented: (HADOOP-2144) Data node process consumes 180% cpu

2007-11-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539635 ] Runping Qi commented on HADOOP-2144: Overall cpu usage is 90+%. It is easy to reproduce. > Data n

[jira] Created: (HADOOP-2144) Data node process consumes 180% cpu

2007-11-02 Thread Runping Qi (JIRA)
Data node process consumes 180% cpu Key: HADOOP-2144 URL: https://issues.apache.org/jira/browse/HADOOP-2144 Project: Hadoop Issue Type: Improvement Reporter: Runping Qi I did a test on

[jira] Updated: (HADOOP-2144) Data node process consumes 180% cpu

2007-11-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2144: --- Component/s: dfs Description: I did a test on DFS read throughput and found that the data node

[jira] Resolved: (HADOOP-2060) DFSClient should choose a block that is local to the node where the client is running

2007-11-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi resolved HADOOP-2060. Resolution: Invalid The suggested feature is already in. > DFSClient should choose a bl

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-11-02 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539581 ] Runping Qi commented on HADOOP-2086: +1 Looks good. > ability to add dependencies to a job af

[jira] Commented: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast

2007-10-29 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538678 ] Runping Qi commented on HADOOP-2119: Yes, the #running-mappers kept on going up and went beyond the actual

[jira] Created: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast

2007-10-29 Thread Runping Qi (JIRA)
Issue Type: Bug Components: mapred Reporter: Runping Qi Fix For: 0.15.0 I ran a job with 0 reducers on a cluster with 390 nodes. The mappers ran very fast. The jobtracker lags behind on committing completed mapper tasks. The number of running mappers displayed

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-29 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538534 ] Runping Qi commented on HADOOP-2086: There is no real harm to make getState synchronized, although either way

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-26 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538051 ] Runping Qi commented on HADOOP-2086: +1. > ability to add dependencies to a job after construct

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-26 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538017 ] Runping Qi commented on HADOOP-2086: Right. WAITING is the only valid state in which we can add a depending job

[jira] Issue Comment Edited: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-25 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537706 ] runping edited comment on HADOOP-2086 at 10/25/07 1:54 PM: -- I like 1). Yes, there is a

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-25 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537706 ] Runping Qi commented on HADOOP-2086: I like 1). Yes, there is a possibility of race condition when the

[jira] Commented: (HADOOP-2086) ability to add dependencies to a job after construction

2007-10-25 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537653 ] Runping Qi commented on HADOOP-2086: +0 the code looks good. However, the semantics of the new api

[jira] Commented: (HADOOP-2095) Reducer failed due to Out ofMemory

2007-10-24 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537492 ] Runping Qi commented on HADOOP-2095: Yes. > Reducer failed due to Out ofMem

[jira] Commented: (HADOOP-2095) Reducer failed due to Out ofMemory

2007-10-23 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537162 ] Runping Qi commented on HADOOP-2095: The problem was gone after I set the compressMapOutput attribute to false

[jira] Updated: (HADOOP-2095) Reducer failed due to Out ofMemory

2007-10-23 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2095: --- Component/s: mapred Fix Version/s: 0.15.0 Description: One of the reducers of my job

[jira] Created: (HADOOP-2095) Reducer failed due to Out ofMemory

2007-10-23 Thread Runping Qi (JIRA)
Reducer failed due to Out ofMemory -- Key: HADOOP-2095 URL: https://issues.apache.org/jira/browse/HADOOP-2095 Project: Hadoop Issue Type: Bug Reporter: Runping Qi One of the reducers of my job

[jira] Assigned: (HADOOP-2094) DFS should not use round robin policy in determing on which volume (file system partition) to allocate for the next block

2007-10-23 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi reassigned HADOOP-2094: -- Assignee: (was: Runping Qi) > DFS should not use round robin policy in determing on wh

[jira] Created: (HADOOP-2094) DFS should not use round robin policy in determing on which volume (file system partition) to allocate for the next block

2007-10-23 Thread Runping Qi (JIRA)
URL: https://issues.apache.org/jira/browse/HADOOP-2094 Project: Hadoop Issue Type: Improvement Components: dfs Reporter: Runping Qi Assignee: Runping Qi When multiple file system partitions are configured for the data storage of a data

[jira] Updated: (HADOOP-2093) DFS should provide partition information for blocks, and map/reduce should schedule avoid schedule mappers with the splits off the same file system partition at the same

2007-10-23 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2093: --- Component/s: mapred dfs Description: The summary is a bit long. But the

[jira] Created: (HADOOP-2093) DFS should provide partition information for blocks, and map/reduce should schedule avoid schedule mappers with the splits off the same file system partition at the same

2007-10-23 Thread Runping Qi (JIRA)
-- Key: HADOOP-2093 URL: https://issues.apache.org/jira/browse/HADOOP-2093 Project: Hadoop Issue Type: New Feature Reporter: Runping Qi The summary is a bit long. But the

[jira] Commented: (HADOOP-1965) Handle map output buffers better

2007-10-17 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535529 ] Runping Qi commented on HADOOP-1965: It seems clear that threaded spill performed much better than sequence

[jira] Commented: (HADOOP-2060) DFSClient should choose a block that is local to the node where the client is running

2007-10-15 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534962 ] Runping Qi commented on HADOOP-2060: OK, that is the part I didn't get. I'll examine the locati

[jira] Created: (HADOOP-2060) DFSClient should choose a block that is local to the node where the client is running

2007-10-15 Thread Runping Qi (JIRA)
Project: Hadoop Issue Type: Bug Components: dfs Reporter: Runping Qi When I chased down the DFSClient code to see how data locality impacts the dfs read throughput, I realized that DFSClient does not use data locality info (at least not obviously to me) when it

[jira] Commented: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-14 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534716 ] Runping Qi commented on HADOOP-2050: It turned out to be a problem in CopyFile class. After a mapper got killed

[jira] Updated: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-14 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2050: --- Component/s: (was: dfs) mapred > distcp failed due to problem in creat

[jira] Commented: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-13 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534570 ] Runping Qi commented on HADOOP-2050: This problem does not happen if the dfs write load is low. When a few

[jira] Updated: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-13 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2050: --- Description: When I run a distcp program to copy files from one dfs to another, my job failed with

[jira] Updated: (HADOOP-2052) distcp mapper's status report misleading

2007-10-13 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2052: --- Description: When the mappers of distcp finish, the status page in the web gui reports the data

[jira] Updated: (HADOOP-2052) distcp mapper's status report misleading

2007-10-13 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2052: --- Component/s: mapred Description: When the mappers of distcp finish, the status page in the web

[jira] Created: (HADOOP-2052) distcp mapper's status report misleading

2007-10-13 Thread Runping Qi (JIRA)
distcp mapper's status report misleading Key: HADOOP-2052 URL: https://issues.apache.org/jira/browse/HADOOP-2052 Project: Hadoop Issue Type: Bug Reporter: Runping Qi When the ma

[jira] Commented: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534462 ] Runping Qi commented on HADOOP-2050: Some mappers failed with the following exception

[jira] Updated: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2050: --- Component/s: dfs I suspect this is a problem in dfs. > distcp failed due to problem in creat

[jira] Created: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-12 Thread Runping Qi (JIRA)
distcp failed due to problem in creating files -- Key: HADOOP-2050 URL: https://issues.apache.org/jira/browse/HADOOP-2050 Project: Hadoop Issue Type: Bug Reporter: Runping Qi When I

[jira] Updated: (HADOOP-2050) distcp failed due to problem in creating files

2007-10-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2050: --- Description: When I run a distcp program to copy files from one dfs to another, my job

[jira] Created: (HADOOP-2048) DISTCP mapper should report progress more often

2007-10-12 Thread Runping Qi (JIRA)
DISTCP mapper should report progress more often --- Key: HADOOP-2048 URL: https://issues.apache.org/jira/browse/HADOOP-2048 Project: Hadoop Issue Type: Bug Reporter: Runping Qi When

[jira] Commented: (HADOOP-2042) distcp job failed

2007-10-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534401 ] Runping Qi commented on HADOOP-2042: The current trunk. I guess it is 0.15 > distcp job fai

[jira] Updated: (HADOOP-2042) distcp job failed

2007-10-12 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2042: --- Component/s: dfs Description: I was running distcp to copy data from one dfs to another. The

[jira] Created: (HADOOP-2042) distcp job failed

2007-10-12 Thread Runping Qi (JIRA)
distcp job failed - Key: HADOOP-2042 URL: https://issues.apache.org/jira/browse/HADOOP-2042 Project: Hadoop Issue Type: Bug Affects Versions: 0.15.0 Reporter: Runping Qi I was running distcp to copy data

[jira] Created: (HADOOP-2032) distcp split generation does not work correctly

2007-10-11 Thread Runping Qi (JIRA)
Reporter: Runping Qi With the current implementation, distcp will always assign multiple files to one mapper to copy, no matter how large the files are. This is because the CopyFiles class uses a sequencefile to store the list of files to be copied, one record per file. CopyFile class
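
The report above says that files are grouped per mapper regardless of their size. As a contrast, here is a minimal Java sketch of grouping a file list by cumulative byte size instead. It is an assumption-laden illustration, not the distcp implementation or the fix adopted in HADOOP-2032: the class `SizeBasedSplits`, the method `group`, and the `targetBytes` parameter are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not distcp code: greedily groups files into
// splits so that each split stays near a target number of bytes.
class SizeBasedSplits {
    // fileSizes: size in bytes of each file, in listing order.
    // targetBytes: hypothetical soft cap on the bytes assigned to one split.
    static List<List<Long>> group(List<Long> fileSizes, long targetBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current split if adding this file would exceed the cap.
            // A file larger than the cap still gets a split of its own.
            if (!current.isEmpty() && currentBytes + size > targetBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }
}
```

For example, sizes of 100, 100, 900 and 50 bytes with a 200-byte target produce three splits: the two small files together, the large file alone, and the last file alone.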

[jira] Commented: (HADOOP-2028) distcp fails if log dir not specified and destination not present

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533904 ] Runping Qi commented on HADOOP-2028: the job fails too if the specified log dir is not empty > distcp fa

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Status: Patch Available (was: Open) > Job tracker should report the number of splits that

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Attachment: (was: hadoop-2015.txt) > Job tracker should report the number of splits that

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Status: Open (was: Patch Available) a more optimized patch is available > Job tracker sho

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Attachment: hadoop-2015.txt a better one > Job tracker should report the number of splits t

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Attachment: hadoop-2015.txt Fix a bug in the patch > Job tracker should report the number

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Status: Patch Available (was: Open) > Job tracker should report the number of splits that

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Attachment: (was: hadoop-2015.txt) > Job tracker should report the number of splits that

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-10 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Status: Open (was: Patch Available) Need some more change > Job tracker should report the num

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Status: Patch Available (was: Open) > Job tracker should report the number of splits that

[jira] Updated: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2015: --- Attachment: hadoop-2015.txt a patch that adds a counter for the total number of tasks that may be

[jira] Assigned: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi reassigned HADOOP-2015: -- Assignee: Runping Qi > Job tracker should report the number of splits that are local to s

[jira] Created: (HADOOP-2015) Job tracker should report the number of splits that are local to some task trackers

2007-10-09 Thread Runping Qi (JIRA)
: Hadoop Issue Type: Improvement Components: mapred Reporter: Runping Qi Right now, the job tracker keeps track of the number of launched mappers with local data. However, it is not clear how many mappers could potentially be launched with data locality. This

[jira] Updated: (HADOOP-2014) Job Tracker should not clobber the data locality of tasks

2007-10-09 Thread Runping Qi (JIRA)
[ https://issues.apache.org/jira/browse/HADOOP-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Runping Qi updated HADOOP-2014: --- Component/s: mapred Description: Currently, when the Job Tracker assigns a mapper task to a

[jira] Created: (HADOOP-2014) Job Tracker should not clobber the data locality of tasks

2007-10-09 Thread Runping Qi (JIRA)
: Runping Qi Currently, when the Job Tracker assigns a mapper task to a task tracker and no split is local to that task tracker, the job tracker will find the first runnable task in the master task list and assign the task to the task tracker. The split for the task is not local to the task
