Re: The NCDC Weather Data for Hadoop the Definitive Guide

2012-02-12 Thread Andy Doddington
According to Page 15 of the book, this data is available from the US National Climatic Data Center, at http://www.ncdc.noaa.gov. Once you get to this site, there is a menu of links on the left-hand side of the page, listed under the heading ‘Data Products’. I suspect that the entry labelled

Re: The NCDC Weather Data for Hadoop the Definitive Guide

2012-02-12 Thread Bing Li
Andy, since there is a lot of free data on the site, I cannot figure out which data set is the one discussed in the book. Any format differences might cause the source code to throw exceptions. Some data is even in PDF format! Thanks so much! Bing On Sun, Feb 12, 2012 at 4:35 PM, Andy

Re: Error in Formatting NameNode

2012-02-12 Thread Manish Maheshwari
Thanks, I tried with hadoop-1.0.0 and JRE6 and things are looking good. I was able to format the namenode and bring up the NameNode 'calvin-PC:47110' and Hadoop Map/Reduce Administration webpages. Further, I tried the TestDFSIO example but got the connection refused error below. -bash-4.1$

Re: Error in Formatting NameNode

2012-02-12 Thread Raj Vishwanathan
Manish, if you read the error message, it says connection refused. Big clue :-) You probably have a firewall configured. Raj Sent from my iPad Please excuse the typos. On Feb 12, 2012, at 1:41 AM, Manish Maheshwari mylogi...@gmail.com wrote: Thanks, I tried with hadoop-1.0.0 and JRE6 and
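
One way to narrow down a "connection refused" error is to check whether the port accepts TCP connections at all, independent of Hadoop. The following is a minimal sketch in Python, not part of the original thread; the function name `port_is_reachable` is made up for illustration, and the host and port to test have to come from your own error message or configuration (for example the address in `fs.default.name`).

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. A False result means the
    connection was refused, timed out, or the host could not be resolved.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and resolution failures.
        return False
```

If this returns False from the client machine but True when run on the server itself, a firewall or a service bound only to the loopback address is a plausible cause. If it returns False in both places, the service is probably not listening on that port.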

Re: Processing small xml files

2012-02-12 Thread W.P. McNeill
I've used the Mahout XMLInputFormat. It is the right tool if you have an XML file with one type of section repeated over and over again and want to turn that into a Sequence file where each repeated section is a value. I've found it helpful as a preprocessing step for converting raw XML input into
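
The idea of turning each repeated section into one value can be sketched in plain Python. This is not Mahout's code and does not use Hadoop; it only illustrates the splitting step, assuming the sections are delimited by a fixed start tag and end tag and are not nested. The function name `split_sections` and the tag names in the usage note are hypothetical.

```python
def split_sections(text, start_tag, end_tag):
    """Yield each substring running from start_tag through end_tag.

    Illustrative sketch only: does plain string matching, so it assumes
    the sections are not nested and the tags appear literally as given.
    """
    pos = 0
    while True:
        begin = text.find(start_tag, pos)
        if begin == -1:
            return  # no further sections
        end = text.find(end_tag, begin + len(start_tag))
        if end == -1:
            return  # unterminated section: stop rather than guess
        end += len(end_tag)
        yield text[begin:end]
        pos = end
```

For example, `list(split_sections(doc, "<rec>", "</rec>"))` gives one string per `<rec>` element in `doc`; in a Hadoop job each of those strings would be the value of one record.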

Re: Processing small xml files

2012-02-12 Thread Mohit Anchlia
On Sun, Feb 12, 2012 at 9:24 AM, W.P. McNeill bill...@gmail.com wrote: I've used the Mahout XMLInputFormat. It is the right tool if you have an XML file with one type of section repeated over and over again and want to turn that into a Sequence file where each repeated section is a value. I've