Re: HDFS - millions of files in one directory?

2009-01-23 Thread Mark V
On Sat, Jan 24, 2009 at 10:03 AM, Mark Kerzner wrote:
> Hi,
>
> there is a performance penalty in Windows (pardon the expression) if you put
> too many files in the same directory. The OS becomes very slow, stops seeing
> them, and lies about their status to my Java requests. I do not know if this

CanonicalInputFormat use case: Hadoop + MonetDB serving localhost

2008-10-16 Thread Mark V
Hi Group, This is the second of two emails, each raising a related idea. The first canvasses an InputFormat that allows a data chunk to be targeted to an MR job/cluster node, and is here: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200810.mbox/[EMAIL PROTECTED] As an aside, I notice

Enhancement idea: Additional InputType

2008-10-15 Thread Mark V
Hi Group, Thanks for the effort placed into making Hadoop available, and for the tips and suggestions posted to this mailing list. I'm currently looking into using Hadoop and, along the way, writing a Ruby DSL for Cascading (www.cascading.org). Some homework has generated two related ideas. I'll