Hi

I would like to develop a simulator (or log data generator) using Hadoop
modules.

The simulator consists of many parallel (time-synchronized) state machines
(possibly even a million), and each of them generates a log file (with
timestamps). The state machines also share common data structures (like a
database of key-value pairs).
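
To make the idea concrete, here is a minimal sketch of what I mean by one
such state machine (plain Java; the class name, states, and row format are
just my own invention):

import java.io.PrintWriter;
import java.util.Random;

// Hypothetical sketch: one state machine that emits timestamped log rows.
// In the full simulator all machines would be stepped with the same tick,
// which is what I mean by "time synchronized".
public class SimStateMachine {
    enum State { IDLE, ACTIVE, ERROR }

    private final int id;
    private State state = State.IDLE;
    private final Random rng;

    SimStateMachine(int id, long seed) {
        this.id = id;
        this.rng = new Random(seed);
    }

    // Advance one simulated tick and return the log row for it.
    String step(long tickMillis) {
        switch (state) {
            case IDLE:   if (rng.nextDouble() < 0.3) state = State.ACTIVE; break;
            case ACTIVE: if (rng.nextDouble() < 0.1) state = State.ERROR;  break;
            case ERROR:  state = State.IDLE; break;
        }
        return tickMillis + "\tsm-" + id + "\t" + state;
    }

    public static void main(String[] args) throws Exception {
        SimStateMachine sm = new SimStateMachine(42, 1L);
        try (PrintWriter log = new PrintWriter("sm-42.log")) {
            for (long t = 0; t < 10_000; t += 1000) {
                log.println(sm.step(t)); // one row per simulated second
            }
        }
    }
}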

Finally, the events (rows) of all the log files should be combined in time
order into one very large log file. In practice, the combined log file could
also be split into smaller ones.
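
If I understand MapReduce correctly, the merge step could be a plain sort
job that keys each row by its timestamp and lets the shuffle do the
ordering. A rough sketch, assuming the row format from above (millisecond
timestamp first, tab-separated; paths come from the command line):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical sketch: merge many per-machine log files into time order
// by letting the MapReduce shuffle sort on the timestamp key.
public class TimeOrderMerge {

    public static class TsMapper extends Mapper<Object, Text, LongWritable, Text> {
        private final LongWritable ts = new LongWritable();
        private final Text rest = new Text();

        @Override
        protected void map(Object offset, Text row, Context ctx)
                throws IOException, InterruptedException {
            // Assumes each row starts with a millisecond timestamp and a tab.
            String line = row.toString();
            int tab = line.indexOf('\t');
            ts.set(Long.parseLong(line.substring(0, tab)));
            rest.set(line.substring(tab + 1));
            ctx.write(ts, rest);
        }
    }

    public static class PassThroughReducer
            extends Reducer<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void reduce(LongWritable ts, Iterable<Text> rows, Context ctx)
                throws IOException, InterruptedException {
            for (Text row : rows) ctx.write(ts, row); // keys arrive sorted
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "time-order merge");
        job.setJarByClass(TimeOrderMerge.class);
        job.setMapperClass(TsMapper.class);
        job.setReducerClass(PassThroughReducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(1); // one reducer -> one globally ordered file
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A single reducer gives one globally ordered file but probably will not
scale to a million sources; as far as I know, Hadoop's
TotalOrderPartitioner would let many reducers each write one sorted slice,
which would also give the "split into smaller files" behavior for free.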

Is that a good or a bad idea?

If it is possible, how could I do it using Hadoop modules (or other
modules/libraries/tools)?
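
For the shared key-value data mentioned above, I was wondering whether
HBase would be a fit. A minimal sketch of what I imagine (the table,
column, and key names are made up):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch: state machines sharing key-value pairs through
// an HBase table.
public class SharedKv {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("sim_shared"))) {

            // One machine writes a shared value...
            Put put = new Put(Bytes.toBytes("config/tick-length"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("v"),
                          Bytes.toBytes("1000"));
            table.put(put);

            // ...and another reads it back.
            Result r = table.get(new Get(Bytes.toBytes("config/tick-length")));
            String value = Bytes.toString(
                r.getValue(Bytes.toBytes("d"), Bytes.toBytes("v")));
            System.out.println("tick-length = " + value);
        }
    }
}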

I would also be interested in how the simulator could produce a real-time
event stream (instead of log files).
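
From what I have read, Kafka (not a Hadoop module itself, but commonly used
alongside Hadoop) might be the tool for this: each state machine would
publish its rows to a topic instead of a file. A sketch of what I have in
mind (the broker address and topic name are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical sketch: publish simulator rows to a Kafka topic
// instead of writing them to a log file.
public class EventStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.LongSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<Long, String> producer = new KafkaProducer<>(props)) {
            for (long t = 0; t < 10_000; t += 1000) {
                String row = t + "\tsm-42\tACTIVE"; // same row format as the logs
                // Keying by timestamp keeps rows of the same tick together.
                producer.send(new ProducerRecord<>("sim-events", t, row));
            }
        }
    }
}

Would that be the right direction for the streaming variant?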

