[jira] Commented: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread Runping Qi (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=comments#action_12375457 ] Runping Qi commented on HADOOP-153: --- +1 Exceptions in the map and reduce functions that are implemented by the user should be handled by the user within the functions. In

[jira] Commented: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread eric baldeschwieler (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=comments#action_12375455 ] eric baldeschwieler commented on HADOOP-153: sounds good. The acceptable % should probably be configurable. I'd be inclined to use something more like 1%. You c

Re: [jira] Commented: (HADOOP-141) Disk thrashing / task timeouts during map output copy phase

2006-04-20 Thread Eric Baldeschwieler
Hmm, the client is timing out while it is getting data? Maybe as long as it is getting data, it should reset its timer? Maybe the server should fail a client if it is busy? This would let you make an informed decision. On Apr 20, 2006, at 11:24 AM, paul sutter (JIRA) wrote: [ http://

Re: nutch user meeting in San Francisco: May 18th

2006-04-20 Thread Doug Cutting
Folks can say whether they'll attend at: http://www.evite.com/app/publicUrl/[EMAIL PROTECTED]/nutch-1 Doug

nutch user meeting in San Francisco: May 18th

2006-04-20 Thread Stefan Groschupf
(with apologies for multiple postings) Dear Nutch users, Dear Nutch developers, Dear Hadoop developers, we would love to invite you to the Nutch user meeting in San Francisco. Date: Thursday, May 18th, 2006 Time: 7 PM. Location: Cafe Du Soleil, 200 Fillmore St, San Francisco, CA 94117. (Th

[jira] Updated: (HADOOP-132) An API for reporting performance metrics

2006-04-20 Thread Sameer Paranjpye (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-132?page=all ] Sameer Paranjpye updated HADOOP-132: Fix Version: 0.2 Version: 0.2 Description: I'd like to propose adding an API for reporting performance metrics. I will post some javadoc a

[jira] Commented: (HADOOP-132) An API for reporting performance metrics

2006-04-20 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-132?page=comments#action_12375438 ] Doug Cutting commented on HADOOP-132: - +1 Looks good to me! I think there's a typo in the overview, where you should have setMetric() you instead have setGauge(). > An

[jira] Commented: (HADOOP-108) EOFException in DataNode$DataXceiver.run

2006-04-20 Thread Igor Bolotin (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-108?page=comments#action_12375436 ] Igor Bolotin commented on HADOOP-108: - This problem has not happened to us since upgrading Hadoop. Also, based on the description, this one looks like a duplicate of H

[jira] Created: (HADOOP-156) Reducer threw IOEOFException

2006-04-20 Thread Runping Qi (JIRA)
Reducer threw IOEOFException - Key: HADOOP-156 URL: http://issues.apache.org/jira/browse/HADOOP-156 Project: Hadoop Type: Bug Reporter: Runping Qi A job was running with all the map tasks completed. The reducers were appending the

[jira] Created: (HADOOP-155) Add a conf dir parameter to the scripts

2006-04-20 Thread Owen O'Malley (JIRA)
Add a conf dir parameter to the scripts --- Key: HADOOP-155 URL: http://issues.apache.org/jira/browse/HADOOP-155 Project: Hadoop Type: Improvement Components: conf Reporter: Owen O'Malley We'd like a conf_dir parameter o

[jira] Created: (HADOOP-154) fsck fails when there is no file in dfs

2006-04-20 Thread Lei Chen (JIRA)
fsck fails when there is no file in dfs --- Key: HADOOP-154 URL: http://issues.apache.org/jira/browse/HADOOP-154 Project: Hadoop Type: Bug Components: dfs Versions: 0.1.1 Reporter: Lei Chen Priority: Trivial

[jira] Updated: (HADOOP-132) An API for reporting performance metrics

2006-04-20 Thread David Bowen (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-132?page=all ] David Bowen updated HADOOP-132: --- Attachment: javadoc.tgz Here is an updated API, incorporating the feedback I've received so far. The main changes are (1) Ganglia support is included - thi

unsorted keys in Map File

2006-04-20 Thread Stefan Groschupf
Hi hadoop developers, I'm looking for a hint or inspiration for a problem I would love to solve with the hadoop platform, but it is not map reduce related. My data structure is built from rows, and each row has a set of columns and column values. For example row key: cnn.com column keys: user

[jira] Updated: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=all ] Doug Cutting updated HADOOP-153: Fix Version: 0.2 > skip records that throw exceptions > -- > > Key: HADOOP-153 > URL: http://issues.apache.org

[jira] Updated: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=all ] Doug Cutting updated HADOOP-153: Version: 0.2 Assign To: Doug Cutting > skip records that throw exceptions > -- > > Key: HADOOP-153 > URL

[jira] Commented: (HADOOP-141) Disk thrashing / task timeouts during map output copy phase

2006-04-20 Thread paul sutter (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-141?page=comments#action_12375411 ] paul sutter commented on HADOOP-141: A few timeouts would be fine. The problem is when the same files timeout over and over again, and progress ceases completely. I was

[jira] Commented: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread [EMAIL PROTECTED] (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=comments#action_12375408 ] [EMAIL PROTECTED] commented on HADOOP-153: -- +1 This would be a generalization of the checksum handler that tries to skip records when 'io.skip.checksum.errors' is se

[jira] Resolved: (HADOOP-69) Unchecked lookup value causes NPE in FSNamesystemgetDatanodeHints

2006-04-20 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-69?page=all ] Doug Cutting resolved HADOOP-69: Fix Version: 0.2 Resolution: Fixed Committed. Sorry, this fell off my radar. Thanks for the reminder. I fixed something related in: http://svn.apa

[jira] Commented: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread Sameer Paranjpye (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-153?page=comments#action_12375405 ] Sameer Paranjpye commented on HADOOP-153: - +1 This would be a cool feature to have. Perhaps the exceptions should also be made visible at the jobtracker. An extension

[jira] Created: (HADOOP-153) skip records that throw exceptions

2006-04-20 Thread Doug Cutting (JIRA)
skip records that throw exceptions -- Key: HADOOP-153 URL: http://issues.apache.org/jira/browse/HADOOP-153 Project: Hadoop Type: New Feature Components: mapred Reporter: Doug Cutting MapReduce should skip records that throw

[jira] Commented: (HADOOP-69) Unchecked lookup value causes NPE in FSNamesystemgetDatanodeHints

2006-04-20 Thread Bryan Pendleton (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-69?page=comments#action_12375403 ] Bryan Pendleton commented on HADOOP-69: --- Bump: Any reason this patch hasn't been applied? It looks like it's still possible for non-found blocks to return null, causing a

[jira] Created: (HADOOP-152) Speculative tasks not being scheduled

2006-04-20 Thread Bryan Pendleton (JIRA)
Speculative tasks not being scheduled - Key: HADOOP-152 URL: http://issues.apache.org/jira/browse/HADOOP-152 Project: Hadoop Type: Bug Components: mapred Versions: 0.2 Environment: ~30 node Opteron cluster Reporte

[jira] Commented: (HADOOP-141) Disk thrashing / task timeouts during map output copy phase

2006-04-20 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-141?page=comments#action_12375401 ] Doug Cutting commented on HADOOP-141: - Some timeouts during the copy phase may not be bad. If too many nodes are transferring from a given node, then it may time out addi

[jira] Updated: (HADOOP-115) Hadoop should allow the user to use SequentialFileOutputformat as the output format and to choose key/value classes that are different from those for map output.

2006-04-20 Thread Teppo Kurki (JIRA)
[ http://issues.apache.org/jira/browse/HADOOP-115?page=all ] Teppo Kurki updated HADOOP-115: --- Attachment: hadoop-115_ReduceTask.patch Patch including TestReduceTask - generates a bunch of SequenceFiles and reduces them by running a single ReduceTask - tw

Re: IdentityMapper

2006-04-20 Thread Stefan Groschupf
Hi Doug, I don't understand the problem here. There is no real problem, just a question to better understand hadoop. My real problem is that the map and reduce tasks have to have the same key and value class. Since changing this is a little bit more work, as far as I can tell, I was thin