Thank you both. I am going to try it with both as a POC.

Bhaskar
On Wed, Jun 13, 2012 at 4:28 PM, Mohammad Tariq <donta...@gmail.com> wrote:
> As Ralph said, HDFS and the local FS are not the only options. You can
> dump your data into any store, like Cassandra, HBase, etc.
>
> Regards,
> Mohammad Tariq
>
>
> On Thu, Jun 14, 2012 at 1:54 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> > FWIW, Hadoop is not the only target repository. We are using Cassandra.
> >
> > Ralph
> >
> > On Jun 13, 2012, at 1:19 PM, Mohammad Tariq wrote:
> >
> >> It's absolutely OK for initial learning, but not feasible for
> >> production or for evaluating the Hadoop ecosystem properly. I would
> >> suggest setting up a Hadoop cluster in pseudo-distributed mode on
> >> your PC first. If you do not require Hadoop at all, then there is
> >> no problem.
> >>
> >> Regards,
> >> Mohammad Tariq
> >>
> >>
> >> On Thu, Jun 14, 2012 at 1:08 AM, Bhaskar <bmar...@gmail.com> wrote:
> >>> Thank you Mohammad for the prompt response. I built it from source
> >>> and have tried a few combinations so far. I do not have HDFS set up
> >>> yet. I was trying to use the local file system as a sink. Is that a
> >>> feasible option?
> >>>
> >>> Bhaskar
> >>>
> >>>
> >>> On Wed, Jun 13, 2012 at 1:45 PM, Mohammad Tariq <donta...@gmail.com> wrote:
> >>>>
> >>>> Hello Bhaskar,
> >>>>
> >>>> The very first step would be to build flume-ng from
> >>>> trunk. You can use the following commands to do that -
> >>>>
> >>>> $ svn co https://svn.apache.org/repos/asf/incubator/flume/trunk flume
> >>>> $ cd flume
> >>>> $ mvn3 install -DskipTests
> >>>>
> >>>> I would suggest using Maven 3.0.3, as you may run into problems
> >>>> with Maven 2. Once you are done with your build, you need to write
> >>>> the configuration files for your agents. For example, to collect
> >>>> Apache web server logs into HDFS, the basic configuration would be
> >>>> something like this -
> >>>>
> >>>> agent1.sources = tail
> >>>> agent1.channels = MemoryChannel-2
> >>>> agent1.sinks = HDFS
> >>>>
> >>>> agent1.sources.tail.type = exec
> >>>> agent1.sources.tail.command = tail -F /var/log/apache2/access.log
> >>>> agent1.sources.tail.channels = MemoryChannel-2
> >>>>
> >>>> agent1.sinks.HDFS.channel = MemoryChannel-2
> >>>> agent1.sinks.HDFS.type = hdfs
> >>>> agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000/flume
> >>>> agent1.sinks.HDFS.hdfs.fileType = DataStream
> >>>>
> >>>> agent1.channels.MemoryChannel-2.type = memory
> >>>>
> >>>> Save this file as agent1.conf inside your flume-ng/conf directory
> >>>> and start your agent using -
> >>>>
> >>>> $ bin/flume-ng agent -n agent1 -f conf/agent1.conf
> >>>>
> >>>>
> >>>> Regards,
> >>>> Mohammad Tariq
> >>>>
> >>>>
> >>>> On Wed, Jun 13, 2012 at 10:45 PM, Bhaskar <bmar...@gmail.com> wrote:
> >>>>> Good Afternoon,
> >>>>> I am a newbie to Flume and have read through the limited
> >>>>> documentation available. I would like to set up the following to
> >>>>> test it out:
> >>>>>
> >>>>> 1. Read Apache access logs (as source)
> >>>>> 2. Use a memory channel
> >>>>> 3. Write to an NFS (or even local) file system
> >>>>>
> >>>>> Can someone help me with the necessary configuration? I am having
> >>>>> a difficult time gleaning that information from the available
> >>>>> documentation. I am sure someone has done such a test before, and
> >>>>> I would appreciate it if you could pass on that information.
> >>>>> Secondly, I would also like to stream the logs to a remote server.
> >>>>> Is that a log4j configuration, or do I need to run an agent on
> >>>>> each host to do so?
> >>>>> Any configuration examples would be of great help.
> >>>>>
> >>>>> Thanks,
> >>>>> Bhaskar
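
On the local file system question above: flume-ng has a file_roll sink that
writes events into a local (or NFS-mounted) directory, so it is a feasible
option for a POC without HDFS. A minimal sketch, reusing the tail source from
the HDFS example; the directory path and roll interval are illustrative, and
the directory must already exist:

agent1.sources = tail
agent1.channels = MemoryChannel-2
agent1.sinks = localFS

agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -F /var/log/apache2/access.log
agent1.sources.tail.channels = MemoryChannel-2

agent1.sinks.localFS.type = file_roll
agent1.sinks.localFS.channel = MemoryChannel-2
# events are appended to files under this directory
agent1.sinks.localFS.sink.directory = /var/flume/spool
# start a new output file every 60 seconds (the default is 30)
agent1.sinks.localFS.sink.rollInterval = 60

agent1.channels.MemoryChannel-2.type = memory

Start it the same way as the HDFS example:

$ bin/flume-ng agent -n agent1 -f conf/agent1.conf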
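
On streaming logs to a remote server: the usual flume-ng pattern is to run a
small agent on each web host whose avro sink forwards events to an avro
source on a central collector agent. A sketch of that two-agent tier follows;
the host name collector.example.com and port 4141 are placeholders, and each
agent would live in its own conf file with its own -n/-f pair:

# agent "web", one per web host
web.sources = tail
web.channels = mem
web.sinks = avroOut

web.sources.tail.type = exec
web.sources.tail.command = tail -F /var/log/apache2/access.log
web.sources.tail.channels = mem

web.sinks.avroOut.type = avro
web.sinks.avroOut.channel = mem
# forward events to the collector's avro source
web.sinks.avroOut.hostname = collector.example.com
web.sinks.avroOut.port = 4141

web.channels.mem.type = memory

# agent "coll", on the central server
coll.sources = avroIn
coll.channels = mem
coll.sinks = localFS

coll.sources.avroIn.type = avro
coll.sources.avroIn.bind = 0.0.0.0
coll.sources.avroIn.port = 4141
coll.sources.avroIn.channels = mem

coll.sinks.localFS.type = file_roll
coll.sinks.localFS.channel = mem
coll.sinks.localFS.sink.directory = /var/flume/collected

coll.channels.mem.type = memory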
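
As for doing it purely from log4j: if your build includes the
flume-ng-log4jappender module, an application can skip tailing files and send
log4j events straight to a remote avro source with an appender along these
lines (hostname and port must match the collector's avro source above, and
the logger name my.app is a placeholder):

log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = collector.example.com
log4j.appender.flume.Port = 4141
# attach the appender to the application's own logger
log4j.logger.my.app = INFO, flume

For Apache's access logs themselves there is no log4j in the picture, so the
tail-plus-agent approach above is the practical route for that part.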