All,

I have run Flume agents on a pseudo-distributed VM from Cloudera, ingesting tweets from Twitter. When I paste the same configurations into the Flume section of Ambari, I do not get any data from Twitter. The Ambari screen says the agents are running, but when I list the target directory in HDFS I see no files:
[root@namenode PBX]# hadoop fs -ls /user/flume/tweets
[root@namenode PBX]# hadoop fs -ls /user/flume/tweets
[root@namenode PBX]# hadoop fs -ls /user/flume/tweets/
[root@namenode PBX]#

I have attached the cluster parameters in a PDF.

Here is the URL I am using to add the configuration to the Flume agents:
http://namenode.localdomain.com:8080/#/main/services/FLUME/configs

Here is the configuration for the Twitter agent:

# defining the source for the agent for Twitter
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemoryChannel
TwitterAgent.sources.Twitter.consumerKey = (just removing for security)
TwitterAgent.sources.Twitter.accessToken = (removing)
TwitterAgent.sources.Twitter.accessTokenSecret = (removing)
TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientist, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
TwitterAgent.sources.Twitter.maxBatchSize = 10
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 200

# defining the interceptors
TwitterAgent.sources.Twitter.interceptors = i1
TwitterAgent.sources.Twitter.interceptors.i1.type = timestamp

# defining the sink for the agent
TwitterAgent.sinks.HDFS.channel = MemoryChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = /user/flume/tweets/%Y/%m/%d
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 100000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 6000
TwitterAgent.sinks.HDFS.hdfs.filePrefix = events-

# defining the channel for the agent
TwitterAgent.channels.MemoryChannel.type = memory
TwitterAgent.channels.MemoryChannel.capacity = 10000
TwitterAgent.channels.MemoryChannel.transactionCapacity = 10000

David Novogrodsky
david.novogrod...@gmail.com
http://www.linkedin.com/in/davidnovogrodsky
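
P.S. One thing that may be worth checking, offered only as a sketch: a Flume properties file normally lists the active components of each agent on lines such as TwitterAgent.sources, TwitterAgent.channels and TwitterAgent.sinks, with names separated by spaces. The configuration as pasted in this message does not show those lines; they may simply have been left out of the paste. The Python below scans a properties text and reports components that are configured but never declared. The function name undeclared_components and the shortened sample text are illustrative assumptions, not part of Flume or Ambari.

```python
from collections import defaultdict

# Component kinds that a Flume agent declares on "<agent>.<kind> = name1 name2" lines.
KINDS = ("sources", "channels", "sinks")


def undeclared_components(text):
    """Return {(agent, kind): [names]} for components that have settings
    but do not appear in the matching "<agent>.<kind> = ..." declaration."""
    declared = defaultdict(set)
    configured = defaultdict(set)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) < 2 or parts[1] not in KINDS:
            continue
        agent, kind = parts[0], parts[1]
        if len(parts) == 2:
            # Declaration line, e.g. "TwitterAgent.sources = Twitter".
            declared[(agent, kind)].update(value.split())
        else:
            # Setting line, e.g. "TwitterAgent.sources.Twitter.type = ...".
            configured[(agent, kind)].add(parts[2])
    missing = {}
    for key, names in configured.items():
        diff = names - declared[key]
        if diff:
            missing[key] = sorted(diff)
    return missing


# Shortened, hypothetical excerpt in the same shape as the config in this post.
sample = """
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemoryChannel
TwitterAgent.sinks.HDFS.channel = MemoryChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.channels.MemoryChannel.type = memory
"""

if __name__ == "__main__":
    for (agent, kind), names in sorted(undeclared_components(sample).items()):
        print(f"{agent}.{kind} does not declare: {' '.join(names)}")
```

If the full configuration in Ambari already contains the three declaration lines, this check reports nothing and the cause lies elsewhere, for example in the agent logs or the HDFS permissions on the target directory.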
Attachment: aMBARIsETuP.pdf (Adobe PDF document)