RE: FYI, Large-scale graph computing at Google

2009-06-25 Thread Patterson, Josh
Steve, I'm a little lost here; is this a replacement for M/R, or is it some new code that sits on top of M/R and runs an iteration over some sort of graph's vertices? My quick scan of Google's article didn't seem to yield a distinction. Either way, I'd say for our data that a graph processing lib fo

RE: Hadoop UI beta

2009-04-22 Thread Patterson, Josh
Stefan, Thanks for contributing this, this is very nice. We may try to use the Hadoop-ui (web server part) as an XML data source to feed a web app showing users the state of their jobs, as this seems like a good simple webserver to customize for pulling job info to another server or via AJAX. T

RE: Hadoop and Matlab

2009-04-21 Thread Patterson, Josh
Sameer, I'd be interested in that as well; we are constructing a Hadoop cluster for energy data (PMU) for the NERC and we will potentially be running jobs for a number of groups and researchers. I know some researchers will know nothing of map reduce, yet are very keen on MatLab, so we're look

Map Rendering

2009-04-13 Thread Patterson, Josh
We're looking into power grid visualization and were wondering if anyone could recommend a good java native lib (that plays nice with hadoop) to render some layers of geospatial data. At this point we have the cluster crunching our test data, formats, and data structures, and we're now looking at p

RE: Small Test Data Sets

2009-03-25 Thread Patterson, Josh
[mailto:enis@gmail.com] Sent: Wednesday, March 25, 2009 5:27 AM To: core-user@hadoop.apache.org Subject: Re: Small Test Data Sets Patterson, Josh wrote: > I want to confirm something with the list that I'm seeing; > > I needed to confirm that my Reader was reading our file format >

Small Test Data Sets

2009-03-24 Thread Patterson, Josh
I want to confirm something with the list that I'm seeing; I needed to confirm that my Reader was reading our file format correctly, so I created a MR job that simply output each K/V pair to the reducer, which then just wrote out each one to the output file. This allows me to check by hand that a

RE: RecordReader design heuristic

2009-03-18 Thread Patterson, Josh
mples include line-oriented text (split at newlines), and bzip2 (has a unique block marker). If your format is splittable then you will be able to take advantage of this to make MR processing more efficient. Cheers, Tom On Wed, Mar 18, 2009 at 5:00 PM, Patterson, Josh wrote: > Jeff, > Yeah, t

RE: RecordReader design heuristic

2009-03-18 Thread Patterson, Josh
pply these algorithms if you have an interest. Jeff Patterson, Josh wrote: > Jeff, > ok, that makes more sense, I was under the misimpression that it was creating and destroying mappers for each input record. I don't know why I had that in my head. My design suddenly became a lot clearer, and

RE: RecordReader design heuristic

2009-03-17 Thread Patterson, Josh
er if you want but you already will have it in the reader so why bother? Jeff Patterson, Josh wrote: > Jeff, > So if I'm hearing you right, it's "good" to send one point of data (10 > bytes here) to a single mapper? This mindset increases the number of > mappers, but

RE: RecordReader design heuristic

2009-03-17 Thread Patterson, Josh
he record reader will be iterating over a block of data to provide mapper inputs. IIRC, splits will generally be an HDFS block or less, so if you have files smaller than that you will get one mapper per file. For larger files you can get up to one mapper per split block. Jeff Patterson, Josh w
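
The rule of thumb in the snippet above (one mapper per file for files smaller than a block, up to one mapper per block for larger files) can be sketched as a rough estimate. This is only an illustration, not Hadoop code: the function name `estimate_mappers` is hypothetical, the 64 MB block size is an assumed value, and real split sizing also depends on settings this sketch ignores.

```python
def estimate_mappers(file_sizes, block_size=64 * 1024 * 1024):
    """Rough upper bound on map tasks: one per block-sized split of each file.

    block_size is an assumed value; the real one comes from the cluster config.
    """
    total = 0
    for size in file_sizes:
        if size <= 0:
            # Assumption for this sketch: empty files are not counted.
            continue
        # Integer ceiling division: a partial trailing block still needs a split.
        total += -(-size // block_size)
    return total


MB = 1024 * 1024
# A 10 MB file and a 64 MB file each fit in one split; a 200 MB file spans four.
print(estimate_mappers([10 * MB, 64 * MB, 200 * MB]))
```

Many small files therefore mean many mappers, which is why the number of records handled per split matters more than the size of any single record.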

RecordReader design heuristic

2009-03-17 Thread Patterson, Josh
I am currently working on a RecordReader to read a custom time series data binary file format and was wondering about ways to be most efficient in designing the InputFormat/RecordReader process. Reading through: http://wiki.apache.org/hadoop/HadoopMapReduce

RE: Issues installing FUSE_DFS

2009-03-03 Thread Patterson, Josh
AM To: core-user@hadoop.apache.org Subject: Re: Issues installing FUSE_DFS On Mar 3, 2009, at 10:01 AM, Patterson, Josh wrote: > Hey Brian, > I'm working with Matthew on our hdfs install, and he's doing the > server > admin on this project; We just tried the settings you sugg

RE: Issues installing FUSE_DFS

2009-03-03 Thread Patterson, Josh
Hey Brian, I'm working with Matthew on our hdfs install, and he's doing the server admin on this project; We just tried the settings you suggested, and we got the following error: [r...@socdvmhdfs1 ~]# fuse_dfs -oserver=socdvmhdfs1 -oport=9000 /hdfs -oallow_other -ordbuffer=131072 fuse-dfs did