Ted,
I am not aware of anything within HBase itself that would be affected by
malicious character strings, although it is quite possible that an
undisclosed vulnerability exists.
However, you have to consider which API you are using as well: in the
Thrift API the Key, Qualifier, and Value are defined as Text, whereas
the Java API uses arbitrary byte arrays. This creates the potential
for bugs (including security bugs) when you are using both APIs. For
instance, Unicode conversion has been used as an attack vector.
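
To illustrate the general hazard (not the behavior of any particular
Thrift client, which I have not verified), here is a minimal Python sketch
of what can happen when raw bytes are pushed through a lossy text
conversion. The helper name and the sample keys are made up for the
example.

```python
def to_text_lossy(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing invalid sequences with U+FFFD.

    Hypothetical helper: stands in for any layer that treats stored bytes
    as text and quietly papers over decoding errors.
    """
    return raw.decode("utf-8", errors="replace")


# Two distinct byte-array keys, as a bytes-oriented API would allow.
key_a = b"user\xff1"
key_b = b"user\xfe1"

text_a = to_text_lossy(key_a)
text_b = to_text_lossy(key_b)

# After the lossy conversion the two keys are indistinguishable ...
collide = text_a == text_b

# ... and re-encoding the text does not give back the original bytes.
round_trips = text_a.encode("utf-8") == key_a

print(collide, round_trips)
```

If one part of the system compares keys as bytes and another compares
them as text, an attacker may be able to exploit that disagreement.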
However, even if HBase itself has no issues, keep in mind that it does
not exist in a vacuum and there will be other components in the solution
stack that may be vulnerable. HBase will happily store and serve up a
piece of malicious JavaScript labeled as "FavoritePet". The entire
application is only as secure as the weakest link.
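
As a rough sketch of what that means for a consumer of the data: assuming
a web front end that renders stored preferences as HTML, both the label
and the value need escaping at output time. The function name and the
markup are hypothetical; html.escape is from the Python standard library.

```python
import html


def render_preference(label: str, value: str) -> str:
    """Escape both user-controlled fields before placing them in HTML."""
    return "<li>{}: {}</li>".format(html.escape(label), html.escape(value))


# A value that was stored verbatim and is now being displayed.
stored_value = "<script>alert('pwned')</script>"
rendered = render_preference("FavoritePet", stored_value)
print(rendered)
```

Without the escaping step the browser would execute the script, even
though the storage layer did nothing wrong.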
Bottom line: User-supplied values must ALWAYS be checked and sanitized
and can NEVER be blindly trusted. Just because your application isn't
vulnerable to a specific attack vector (like SQL injection, cross-site
scripting, or shell escape attacks) doesn't mean that an application
which consumes data from your system is going to share that immunity.
Practice defense in depth.
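
One possible way to check a user-chosen key before using it as a column
qualifier is an allowlist, sketched below in Python. The length limit and
the permitted character set are assumptions chosen for the example, not
HBase requirements; pick whatever your application actually needs.

```python
import re

# Assumed policy for this sketch, not an HBase limit.
MAX_QUALIFIER_LEN = 64
ALLOWED_QUALIFIER = re.compile(r"[A-Za-z0-9_]+")


def validate_qualifier(name: str) -> bytes:
    """Return the qualifier as bytes, or raise ValueError if it is unsafe.

    Rejects empty names, overlong names, and anything outside the
    allowlist (colons, control characters, non-ASCII, and so on).
    """
    if not name or len(name) > MAX_QUALIFIER_LEN:
        raise ValueError("qualifier is empty or too long")
    if not ALLOWED_QUALIFIER.fullmatch(name):
        raise ValueError("qualifier contains disallowed characters")
    return name.encode("ascii")


print(validate_qualifier("FavouriteColour"))

for bad in ["Favourite:Pet", "pet\x08", ""]:
    try:
        validate_qualifier(bad)
    except ValueError as exc:
        print(repr(bad), "rejected:", exc)
```

An allowlist is stricter than trying to enumerate bad characters, and it
also keeps the qualifier identical whether it is read as bytes or as text.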
On 9/30/2014 5:19 PM, Ted wrote:
Hi, I'm wondering if it's safe to use user-supplied values as column qualifiers.
I realise there may be a sensible size limit, but that's easily checked.
The scenario is that you want to store simple key/value pairs as
column/values, for example someone's preferences:
FavouriteColour=Red
FavouritePet=Cat
where the user may get to choose both the key and value.
Basically the concern is special characters and/or special parsing of
the column names. As an example, the column names are allegedly =
<family_name> : <column_qualifier>
so what happens if people put more colons in the qualifier and/or
escape characters like backspace or other control characters etc? Is
there any danger or is it all just uninterpreted bytes values after
the first colon?
thanks