HDFS VFS Driver

2010-06-16 Thread Michael D'Amour
We have an open source ETL tool (Kettle) which uses VFS for many input/output steps/jobs. We would like to be able to read/write HDFS from Kettle using VFS. I haven't been able to find anything out there other than comments that it would be nice. I had some time a few weeks ago to begin writing a VFS

Re: HDFS VFS Driver

2010-06-16 Thread Arun C Murthy
Michael, Please open a jira (new feature) and attach your patch there: http://wiki.apache.org/hadoop/HowToContribute thanks, Arun On Jun 16, 2010, at 8:55 AM, Michael D'Amour wrote: We have an open source ETL tool (Kettle) which uses VFS for many input/output steps/jobs. We would like to be

Re: HDFS VFS Driver

2010-06-16 Thread Dhruba Borthakur
hi mike, it will be nice to get a high level doc on what/how it is implemented. also, you might want to compare it with fuse-dfs http://wiki.apache.org/hadoop/MountableHDFS thanks, dhruba On Wed, Jun 16, 2010 at 8:55 AM, Michael D'Amour mdam...@pentaho.com wrote: We have an open source ETL

[jira] Created: (HDFS-1213) Implement a VFS Driver for HDFS

2010-06-16 Thread Michael D'Amour (JIRA)
Implement a VFS Driver for HDFS --- Key: HDFS-1213 URL: https://issues.apache.org/jira/browse/HDFS-1213 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client Reporter: Michael

[jira] Created: (HDFS-1214) hdfs client metadata cache

2010-06-16 Thread Joydeep Sen Sarma (JIRA)
hdfs client metadata cache -- Key: HDFS-1214 URL: https://issues.apache.org/jira/browse/HDFS-1214 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client Reporter: Joydeep Sen Sarma

[jira] Resolved: (HDFS-1215) TestNodeCount infinite loops on branch-20-append

2010-06-16 Thread Todd Lipcon (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1215. --- Assignee: Todd Lipcon Resolution: Fixed Dhruba committed to 20-append branch TestNodeCount

[jira] Resolved: (HDFS-1216) Update to JUnit 4 in branch 20 append

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1216. Resolution: Fixed I just committed this. Thanks Todd! Update to JUnit 4 in branch 20

[jira] Created: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization

2010-06-16 Thread Todd Lipcon (JIRA)
20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL:

[jira] Resolved: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-142. --- Resolution: Fixed I have committed this. Thanks Sam, Nicolas and Todd. In 0.20, move blocks

[jira] Resolved: (HDFS-1141) completeFile does not check lease ownership

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1141. Resolution: Fixed Pulled into hadoop-0.20-append completeFile does not check lease

[jira] Resolved: (HDFS-1207) 0.20-append: stallReplicationWork should be volatile

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1207. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd!

[jira] Resolved: (HDFS-1210) DFSClient should log exception when block recovery fails

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1210. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd.

[jira] Resolved: (HDFS-1211) 0.20 append: Block receiver should not log rewind packets at INFO level

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1211. Resolution: Fixed I just committed this. Thanks Todd! 0.20 append: Block receiver should

[jira] Created: (HDFS-1219) Data Loss due to edits log truncation

2010-06-16 Thread Thanh Do (JIRA)
Data Loss due to edits log truncation - Key: HDFS-1219 URL: https://issues.apache.org/jira/browse/HDFS-1219 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions:

[jira] Created: (HDFS-1221) NameNode unable to start due to stale edits log after a crash

2010-06-16 Thread Thanh Do (JIRA)
NameNode unable to start due to stale edits log after a crash - Key: HDFS-1221 URL: https://issues.apache.org/jira/browse/HDFS-1221 Project: Hadoop HDFS Issue Type: Bug Affects

[jira] Created: (HDFS-1225) Block lost when primary crashes in recoverBlock

2010-06-16 Thread Thanh Do (JIRA)
Block lost when primary crashes in recoverBlock --- Key: HDFS-1225 URL: https://issues.apache.org/jira/browse/HDFS-1225 Project: Hadoop HDFS Issue Type: Bug Components: data-node

[jira] Created: (HDFS-1227) UpdateBlock fails due to unmatched file length

2010-06-16 Thread Thanh Do (JIRA)
UpdateBlock fails due to unmatched file length -- Key: HDFS-1227 URL: https://issues.apache.org/jira/browse/HDFS-1227 Project: Hadoop HDFS Issue Type: Bug Components: data-node

[jira] Created: (HDFS-1228) CRC does not match when retrying appending a partial block

2010-06-16 Thread Thanh Do (JIRA)
CRC does not match when retrying appending a partial block -- Key: HDFS-1228 URL: https://issues.apache.org/jira/browse/HDFS-1228 Project: Hadoop HDFS Issue Type: Bug

[jira] Created: (HDFS-1230) BlocksMap.blockinfo is not getting cleared immediately after deleting a block. This will be cleared only after block report comes from the datanode. Why we need to maintain t

2010-06-16 Thread Gokul (JIRA)
BlocksMap.blockinfo is not getting cleared immediately after deleting a block. This will be cleared only after block report comes from the datanode. Why we need to maintain the blockinfo till that time.

[jira] Created: (HDFS-1231) Generation Stamp mismatches, leading to failed append

2010-06-16 Thread Thanh Do (JIRA)
Generation Stamp mismatches, leading to failed append - Key: HDFS-1231 URL: https://issues.apache.org/jira/browse/HDFS-1231 Project: Hadoop HDFS Issue Type: Bug Components: hdfs

[jira] Created: (HDFS-1233) Bad retry logic at DFSClient

2010-06-16 Thread Thanh Do (JIRA)
Bad retry logic at DFSClient Key: HDFS-1233 URL: https://issues.apache.org/jira/browse/HDFS-1233 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.20.1

[jira] Created: (HDFS-1232) Corrupted block if a crash happens before writing to checksumOut but after writing to dataOut

2010-06-16 Thread Thanh Do (JIRA)
Corrupted block if a crash happens before writing to checksumOut but after writing to dataOut - Key: HDFS-1232 URL: https://issues.apache.org/jira/browse/HDFS-1232

[jira] Created: (HDFS-1234) Datanode 'alive' but with its disk failed, Namenode thinks it's alive

2010-06-16 Thread Thanh Do (JIRA)
Datanode 'alive' but with its disk failed, Namenode thinks it's alive - Key: HDFS-1234 URL: https://issues.apache.org/jira/browse/HDFS-1234 Project: Hadoop HDFS Issue Type:

[jira] Created: (HDFS-1235) Namenode returning the same Datanode to client, due to infrequent heartbeat

2010-06-16 Thread Thanh Do (JIRA)
Namenode returning the same Datanode to client, due to infrequent heartbeat --- Key: HDFS-1235 URL: https://issues.apache.org/jira/browse/HDFS-1235 Project: Hadoop HDFS

[jira] Created: (HDFS-1236) Client uselessly retries recoverBlock 5 times

2010-06-16 Thread Thanh Do (JIRA)
Client uselessly retries recoverBlock 5 times - Key: HDFS-1236 URL: https://issues.apache.org/jira/browse/HDFS-1236 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1