Hi,

I followed the same steps described in the mails below to get started with the HBase sink, but the agent is not storing any data in HBase. I added hbase-site.xml to the $CLASSPATH variable so that the agent picks up the HBase settings, and I can connect to the HBase server from the agent machine.
I am now unable to troubleshoot this problem any further. Seeking advice from the community members...

----------------------------------------
Thanks & Regards,
Ashutosh Sharma
----------------------------------------

-----Original Message-----
From: Mohammad Tariq [mailto:donta...@gmail.com]
Sent: Friday, June 15, 2012 9:02 AM
To: flume-user@incubator.apache.org
Subject: Re: Hbase-sink behavior

Thank you so much, Hari, for the valuable response. I'll follow the guidelines you provided.

Regards,
Mohammad Tariq

On Fri, Jun 15, 2012 at 5:26 AM, Hari Shreedharan <hshreedha...@cloudera.com> wrote:
> Hi Mohammad,
>
> My answers are inline.
>
> --
> Hari Shreedharan
>
> On Thursday, June 14, 2012 at 4:47 PM, Mohammad Tariq wrote:
>
> Hello list,
>
> I am trying to use hbase-sink to collect data from a local file and
> dump it into an HBase table, but there are a few things I am not able
> to understand and need some guidance on.
>
> This is the content of my conf file:
>
> hbase-agent.sources = tail
> hbase-agent.sinks = sink1
> hbase-agent.channels = ch1
> hbase-agent.sources.tail.type = exec
> hbase-agent.sources.tail.command = tail -F /home/mohammad/demo.txt
> hbase-agent.sources.tail.channels = ch1
> hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
> hbase-agent.sinks.sink1.channel = ch1
> hbase-agent.sinks.sink1.table = test3
> hbase-agent.sinks.sink1.columnFamily = testing
> hbase-agent.sinks.sink1.column = foo
> hbase-agent.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
> hbase-agent.sinks.sink1.serializer.payloadColumn = col1
> hbase-agent.sinks.sink1.serializer.incrementColumn = col1
> hbase-agent.sinks.sink1.serializer.keyType = timestamp
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1
> hbase-agent.sinks.sink1.serializer.suffix = timestamp
> hbase-agent.channels.ch1.type=memory
>
> Right now I am taking just some simple text from a file which has the
> following content -
>
> value1
> value2
> value3
> value4
> value5
> value6
>
> And my HBase table looks like -
>
> hbase(main):217:0> scan 'test3'
> ROW              COLUMN+CELL
> 11339716704561   column=testing:col1, timestamp=1339716707569, value=value1
> 11339716704562   column=testing:col1, timestamp=1339716707571, value=value4
> 11339716846594   column=testing:col1, timestamp=1339716849608, value=value2
> 11339716846595   column=testing:col1, timestamp=1339716849610, value=value1
> 11339716846596   column=testing:col1, timestamp=1339716849611, value=value6
> 11339716846597   column=testing:col1, timestamp=1339716849614, value=value6
> 11339716846598   column=testing:col1, timestamp=1339716849615, value=value5
> 11339716846599   column=testing:col1, timestamp=1339716849615, value=value6
> incRow           column=testing:col1, timestamp=1339716849677, value=\x00\x00\x00\x00\x00\x00\x00\x1C
> 9 row(s) in 0.0580 seconds
>
> Now I have the following questions -
>
> 1- Why is the timestamp value different from the row key? (I was trying
> to make "1+timestamp" the row key.)
>
> The value shown by the hbase shell as timestamp is the time at which the
> value was inserted into HBase, while the value inserted by Flume is
> the timestamp at which the sink read the event from the channel.
> Depending on how long the network and HBase take, these timestamps
> can differ. If you want 1+timestamp as the row key, then you should
> configure it:
>
> hbase-agent.sinks.sink1.serializer.rowPrefix = 1+
>
> This prefix is prepended as-is to the suffix you choose.
>
> 2- Although I am not using "incRow", it still appears in the table
> with some value. Why so and what is this value??
>
> The SimpleHbaseEventSerializer is only an example class. For custom
> use cases you can write your own serializer by implementing
> HbaseEventSerializer. In this case, you have specified
> incrementColumn, which causes an increment on the column specified.
> Simply don't specify that config and that row will not appear.
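
To make the two answers above concrete, here is a minimal Python sketch that reads values copied from the quoted scan output. It assumes, following Hari's description, that the row key is the configured rowPrefix immediately followed by a millisecond timestamp, and that the incRow cell holds an 8-byte big-endian long, which is how HBase stores counters. The function names split_row_key and decode_counter are made up for this illustration and are not part of Flume or HBase.

```python
# Values below are copied from the `scan 'test3'` output quoted in this thread.
ROW_PREFIX = "1"  # serializer.rowPrefix from the quoted configuration


def split_row_key(row_key: str, prefix: str = ROW_PREFIX) -> int:
    """Strip the configured prefix and return the embedded millisecond timestamp.

    Assumption: the key is exactly prefix + str(timestamp_in_millis).
    """
    if not row_key.startswith(prefix):
        raise ValueError(f"row key {row_key!r} does not start with {prefix!r}")
    return int(row_key[len(prefix):])


def decode_counter(raw: bytes) -> int:
    """Interpret an increment cell as an 8-byte big-endian signed long."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(raw, byteorder="big", signed=True)


# First row of the scan: key 11339716704561, cell timestamp 1339716707569.
sink_millis = split_row_key("11339716704561")
cell_millis = 1339716707569
lag_ms = cell_millis - sink_millis  # time between key generation and the write

# The incRow cell: value=\x00\x00\x00\x00\x00\x00\x00\x1C
counter = decode_counter(b"\x00\x00\x00\x00\x00\x00\x00\x1c")

print(f"row key timestamp : {sink_millis}")
print(f"cell timestamp    : {cell_millis}")
print(f"lag (ms)          : {lag_ms}")
print(f"incRow counter    : {counter}")
```

For this row the key and the cell timestamp are about three seconds apart, which fits the explanation that the two values are taken at different moments. The incRow bytes decode to a plain counter value.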
>
> 3- How can I avoid the last row??
>
> See above.
>
> I am still in the learning phase so please pardon my ignorance. Many thanks.
>
> No problem. Much of this is documented here:
> https://builds.apache.org/job/flume-trunk/site/apidocs/index.html
>
> Regards,
> Mohammad Tariq
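
When an agent starts but nothing reaches HBase, one thing that can be checked offline is whether the configuration is wired consistently. The Python sketch below is only an illustration under stated assumptions: it parses the wiring-related lines of the configuration quoted in this thread and reports sinks or sources that point at undeclared channels, as well as an agent name under which nothing is declared. The helpers parse_properties and check_wiring are hypothetical names, not Flume tools, and a clean result does not rule out other causes.

```python
# Wiring-related lines copied from the configuration quoted in this thread.
CONFIG = """\
hbase-agent.sources = tail
hbase-agent.sinks = sink1
hbase-agent.channels = ch1
hbase-agent.sources.tail.channels = ch1
hbase-agent.sinks.sink1.channel = ch1
"""


def parse_properties(text: str) -> dict:
    """Parse simple `key = value` lines, skipping blanks and `#` comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props: dict, agent: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    declared = {
        kind: props.get(f"{agent}.{kind}", "").split()
        for kind in ("sources", "sinks", "channels")
    }
    if not any(declared.values()):
        # Typical symptom when the agent is started under a different name
        # than the one used as the key prefix in the file.
        return [f"no sources, sinks or channels declared for agent {agent!r}"]

    problems = []
    channels = set(declared["channels"])
    for source in declared["sources"]:
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source!r} has no channels")
        for channel in used:
            if channel not in channels:
                problems.append(f"source {source!r} uses undeclared channel {channel!r}")
    for sink in declared["sinks"]:
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink!r} has no channel")
        elif channel not in channels:
            problems.append(f"sink {sink!r} uses undeclared channel {channel!r}")
    return problems


props = parse_properties(CONFIG)
print(check_wiring(props, "hbase-agent") or "wiring looks consistent")
```

Calling check_wiring with the name the agent is actually started under, rather than the name in the file, shows whether the two match.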