[ 
https://issues.apache.org/jira/browse/FLUME-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294756#comment-13294756
 ] 

Hari Shreedharan commented on FLUME-1275:
-----------------------------------------

Patrick:

Thanks for the patch! Overall looks good. I have some minor suggestions though:

* There is a final variable, ENCODING, which is used only for String.getBytes() 
calls. Replacing these with String.getBytes(Charsets.UTF_8) has the same 
effect and does not throw UnsupportedEncodingException. Also, do you want to 
make the encoding configurable? It is not a common use case, but it might be 
worth considering. I don't really mind supporting only UTF-8, though.

* Please document the configuration.

* Using SimpleRowKeyGenerator's timestamp key is not a good idea, or at least 
it should be configurable. That loop could run several times within the same 
millisecond, creating several Puts with the same row key. I don't have a good 
solution for this other than creating an interface for row key generators and 
using an implementation named in the configuration, which would also make it 
pluggable.
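
To make the first and third suggestions concrete, here is a rough sketch. All 
names in it (RowKeyGenerator, TimestampCounterRowKeyGenerator, 
RowKeyGenerators.fromClassName) are hypothetical and not existing Flume classes, 
and the counter suffix is just one possible way to avoid duplicate keys. It uses 
java.nio.charset.StandardCharsets (Java 7+) in place of Guava's Charsets so the 
snippet is self-contained; Charsets.UTF_8 can be passed to getBytes() in the 
same way.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical plug-in point for row keys; not an existing Flume interface. */
interface RowKeyGenerator {
    byte[] nextRowKey();
}

/**
 * Illustrative implementation: timestamp plus a per-instance counter, so two
 * calls in the same millisecond still produce different keys.
 */
final class TimestampCounterRowKeyGenerator implements RowKeyGenerator {
    private final AtomicLong counter = new AtomicLong();

    @Override
    public byte[] nextRowKey() {
        String key = System.currentTimeMillis() + "-" + counter.getAndIncrement();
        // The Charset overload of getBytes() declares no checked exception,
        // unlike getBytes(String charsetName).
        return key.getBytes(StandardCharsets.UTF_8);
    }
}

/** Hypothetical helper that loads the generator named in the configuration. */
final class RowKeyGenerators {
    private RowKeyGenerators() {}

    static RowKeyGenerator fromClassName(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(RowKeyGenerator.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Cannot create row key generator: " + className, e);
        }
    }
}
```

The serializer would then call something like 
RowKeyGenerators.fromClassName(...) once during configuration, with the class 
name read from a sink property, and ask the returned generator for a key for 
each Put. Note that the counter is only unique within one generator instance; 
several sinks or agents writing to the same table would need something more, 
such as a host or random component in the key.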
                
> Add Regex Serializer for HBaseSink
> ----------------------------------
>
>                 Key: FLUME-1275
>                 URL: https://issues.apache.org/jira/browse/FLUME-1275
>             Project: Flume
>          Issue Type: Improvement
>            Reporter: Patrick Wendell
>         Attachments: FLUME-1275.patch.v1.txt
>
>
> It would be nice to have an "out of the box" HBase serializer that can 
> extract column data from a regular expression. This is a feature in Hive and 
> it is widely used:
> https://issues.apache.org/jira/browse/HIVE-167


        
