Ted Malaska created SPARK-2447:
----------------------------------

             Summary: Add common solution for sending upsert actions to HBase 
(put, deletes, and increment)
                 Key: SPARK-2447
                 URL: https://issues.apache.org/jira/browse/SPARK-2447
             Project: Spark
          Issue Type: New Feature
            Reporter: Ted Malaska


Going to review the design with Tdas today.  

My first thought is to have an extension of VoidFunction that handles the 
connection to HBase and allows for options such as turning auto-flush off for 
higher throughput.
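
A minimal sketch of that idea follows, assuming a buffering wrapper is roughly what is meant. To keep it self-contained it does not use the real Spark or HBase classes: the VoidFunction interface below only mirrors the shape of Spark's Java VoidFunction, and UpsertSink, BufferedUpsertFunction, RecordingSink, and the batchSize parameter are hypothetical names introduced for illustration. Buffering the actions and writing them in batches plays the role that turning auto-flush off would play against HBase.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in with the same shape as Spark's Java VoidFunction<T>
// (assumption: the real interface would be used in an actual implementation).
interface VoidFunction<T> {
    void call(T t) throws Exception;
}

// Hypothetical stand-in for an open HBase table connection.
interface UpsertSink<A> extends AutoCloseable {
    void write(List<A> batch);

    @Override
    void close();
}

// Hypothetical base class: converts each record to an upsert action,
// buffers the actions, and writes them to the sink in batches.
abstract class BufferedUpsertFunction<T, A> implements VoidFunction<T>, AutoCloseable {
    private final UpsertSink<A> sink;
    private final int batchSize;
    private final List<A> buffer = new ArrayList<>();

    BufferedUpsertFunction(UpsertSink<A> sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.sink = sink;
        this.batchSize = batchSize;
    }

    // Subclasses decide how a record becomes a put, delete, or increment.
    protected abstract A toAction(T record);

    @Override
    public void call(T record) {
        buffer.add(toAction(record));
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Writes any buffered actions as one batch; does nothing if the buffer is empty.
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.write(new ArrayList<>(buffer));
        buffer.clear();
    }

    // Flushes the remaining actions, then closes the sink even if the flush fails.
    @Override
    public void close() {
        try {
            flush();
        } finally {
            sink.close();
        }
    }
}

// In-memory sink that records each batch, for demonstration only.
class RecordingSink implements UpsertSink<String> {
    final List<List<String>> batches = new ArrayList<>();
    boolean closed = false;

    @Override
    public void write(List<String> batch) {
        batches.add(batch);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

With a batchSize of 2, three calls produce one write after the second record and a second write when close() flushes the remainder. A batchSize of 1 behaves like auto-flush being left on.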

Need to answer the following questions first.
- Can it be written in Java or should it be written in Scala?
- What is the best way to add the HBase dependency? (will review how Flume does 
this as the first option)
- What is the best way to do testing? (will review how Flume does this as the 
first option)
- How to support Python? (Python may be a separate Jira; this is unknown at 
this time)

Goals:
- Simple to use
- Stable
- Supports high load
- Documented (may be in a separate Jira; need to ask Tdas)
- Supports Java, Scala, and hopefully Python
- Supports Streaming and normal Spark



