I suspect the reason no one is responding with good answers is that,
fundamentally, what you are trying to do runs against the way Hadoop is
designed. A parallel processing framework is defeated if you force it not
to work concurrently...
Maybe you should look into
It was my understanding that Hortonworks depended on Cygwin (UNIX emulation
on Windows) for most of the Bigtop family of tools - Hadoop core,
MapReduce, etc. - so you will probably make all your configuration files
in Windows, since XML is platform-agnostic, and you can develop in Windows,
since JARs and
So, perhaps this has been thought of, but perhaps not.
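For what it's worth, the configuration XML really is portable between
platforms. A minimal core-site.xml sketch, where the host and port are
placeholders (on Hadoop 1.x the property is called fs.default.name):

<?xml version="1.0"?>
<!-- core-site.xml: the same file works on Windows and Linux;
     only host names and local paths need to change. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>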
It is my understanding that grep usually processes input one line at a
time. As I am currently experimenting with Avro, I am finding that the
local grep utility does not handle it well at all, because an Avro data
file is essentially one long line, so
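One workaround, assuming the data is in the standard Avro container
format, is to decode the file and print one JSON record per line before
grepping - avro-tools has a "tojson" command that does this, and a minimal
sketch with the GenericRecord API (no generated classes needed, since the
writer's schema is embedded in the file) looks like:

import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class AvroToJsonLines {
  public static void main(String[] args) throws Exception {
    // Open the container file; the reader picks up the schema
    // embedded in the file header, so nothing is declared here.
    DataFileReader<GenericRecord> reader = new DataFileReader<GenericRecord>(
        new File(args[0]), new GenericDatumReader<GenericRecord>());
    for (GenericRecord record : reader) {
      // GenericRecord#toString() renders the record as JSON,
      // giving grep one record per line to match against.
      System.out.println(record);
    }
    reader.close();
  }
}

Then something like "java AvroToJsonLines events.avro | grep ERROR" behaves
the way grep expects (the file name here is just an example).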
I temporarily set the permissions to 777 to see if anything would change,
but it didn't... I checked only the JobTracker - are the other nodes
important for this as well?
Thanks already in advance, especially for the quick response!
Wolli
2013/10/11 DSuiter RDX dsui...@rdx.com
The user running
Sagar,
It sounds like you want a management console. We are using Cloudera
Manager, but for 200 nodes you would need to license it; it is only free
up to 50 nodes.
The FOSS version of this is Ambari, IIRC.
http://incubator.apache.org/ambari/
Flume will provide a Hadoop-integrated pipeline for
Hi,
We are working on building a MapReduce program that takes Avro input from
HDFS, gets the timestamp, and counts the number of events written on any
given day. We would like the program not to need the Avro schema declared
ahead of time; rather, it would be best if it could read
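For what it's worth, here is a rough sketch of how that can be done with
the GenericRecord API, so no schema has to be compiled in - the writer's
schema is read from the Avro container file itself. This is untested,
assumes the newer org.apache.hadoop.mapreduce API with the avro-mapred
artifact on the classpath, and the "timestamp" field name (a long holding
epoch millis) is an assumption about your data:

import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapreduce.AvroKeyInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventsPerDay {

  public static class DayMapper
      extends Mapper<AvroKey<GenericRecord>, NullWritable, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
    private final Text day = new Text();

    @Override
    protected void map(AvroKey<GenericRecord> key, NullWritable ignored,
        Context context) throws IOException, InterruptedException {
      // No reader schema is set on the job, so the writer's schema
      // embedded in each input file is used to decode the records.
      long ts = (Long) key.datum().get("timestamp"); // assumed field name
      day.set(fmt.format(new Date(ts)));
      context.write(day, ONE);
    }
  }

  public static class SumReducer
      extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text day, Iterable<LongWritable> counts,
        Context context) throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable c : counts) {
        sum += c.get();
      }
      context.write(day, new LongWritable(sum)); // events per calendar day
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    job.setJarByClass(EventsPerDay.class);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    job.setMapperClass(DayMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}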