Lewis John McGibbney updated CHUKWA-734:
----------------------------------------
Attachment: CHUKWA-734.patch
We just released Apache Gora 0.6 today, so I thought I would put this together
with the aim of building upon the initial patch.
The initial patch contains
* an implementation of GoraWriter, with as much documentation as I saw fit to add
* a Gora implementation of the Chukwa Chunk, e.g. data and metadata
* an HBase mapping (gora-hbase-mapping.xml)
* a gora.properties file
* the gora-hbase dependency, as well as the required gora-hadoop-X
dependencies, within pom.xml
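For readers who have not used Gora before, here is a minimal sketch of the kind of settings a gora.properties file usually carries and how they can be read. The contents of the file in the patch are not shown in this message, so the sample text below is an assumption: `gora.datastore.default` and `gora.datastore.autocreateschema` are standard Gora keys, but the values chosen here and the class name `GoraPropertiesSketch` are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class GoraPropertiesSketch {
    // Assumed file content; the actual gora.properties in the patch may differ.
    static final String SAMPLE =
        "gora.datastore.default=org.apache.gora.hbase.store.HBaseStore\n"
        + "gora.datastore.autocreateschema=true\n";

    // Parse properties text using the standard java.util.Properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Return the configured default datastore class name, failing loudly if it is absent.
    static String defaultDataStore(Properties props) {
        String cls = props.getProperty("gora.datastore.default");
        if (cls == null || cls.isEmpty()) {
            throw new IllegalStateException("gora.datastore.default is not set");
        }
        return cls;
    }

    public static void main(String[] args) {
        System.out.println(defaultDataStore(parse(SAMPLE)));
    }
}
```

Checking the key up front like this is only a convenience for spotting a missing or misplaced file early; Gora itself reads gora.properties from the classpath.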
What you need to do to get it working
* uncomment the gora-hbase dependency within pom.xml
* use GoraWriter as the writer implementation within agent-conf (please see
the patch for the addition to this file)
* mvn install
What I would like from you guys
* try giving it a spin and see if you can use it... if you can't then I would
very much appreciate the feedback.
Some notes
* Gora 0.6 supports HBase 0.98.8-hadoop2
* Supported Hadoop versions are 1.2.1 and 2.5.2
* We use Avro for serialization, hence everything will be in HBase as Avro
serialized data.
Some things [~eyang] and I still need to sort out
* What does the primary key look like?
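To make the primary key question concrete, here is a sketch of one possible layout. It is purely hypothetical and not something that has been agreed: the key is built from a chunk's source, data type and sequence ID, with the sequence ID zero-padded so that HBase's lexicographic row ordering matches numeric order. The class name, the `/` separator and the choice of fields are all assumptions made for illustration.

```java
final class ChunkKeySketch {
    // Long.MAX_VALUE has 19 decimal digits, so 19 is enough padding for any non-negative long.
    private static final int SEQ_WIDTH = 19;

    // Build a hypothetical row key of the form source/dataType/zero-padded-seqId.
    static String rowKey(String source, String dataType, long seqId) {
        if (seqId < 0) {
            throw new IllegalArgumentException("seqId must be non-negative");
        }
        return source + "/" + dataType + "/" + String.format("%0" + SEQ_WIDTH + "d", seqId);
    }

    public static void main(String[] args) {
        // Keys for the same source and data type sort by sequence ID.
        System.out.println(rowKey("host1", "SysLog", 9));
        System.out.println(rowKey("host1", "SysLog", 10));
    }
}
```

A layout like this keeps all chunks from one source and data type adjacent, which suits range scans, but it can concentrate writes on one region when a single source is very busy. That trade-off is part of what still needs deciding.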
Next steps
* I get feedback on this
* I think about primary key support
* I write some tests using Gora's MemStore to simulate mapping Chukwa chunk
data to a Gora datastore.
> Gora Storage System for Chuckwa Logs
> ------------------------------------
>
> Key: CHUKWA-734
> URL: https://issues.apache.org/jira/browse/CHUKWA-734
> Project: Chukwa
> Issue Type: New Feature
> Components: Data Collection
> Affects Versions: 0.6.0
> Reporter: Lewis John McGibbney
> Fix For: 0.6.0
>
> Attachments: CHUKWA-734.patch
>
> Original Estimate: 5h
> Remaining Estimate: 5h
>
> I would like to build a Gora-backed log-to-datastore module for Chukwa. I am
> going to work on this today.
> Gora is an in-memory data modeling and storage abstraction
> http://gora.apache.org
> Gora powers the Apache Nutch 2.X software which generates a bunch of log
> data. Having a Chukwa monitoring tool for Nutch would be grand.