[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200634#comment-14200634 ]
Patrick Wendell commented on SPARK-2447:
----------------------------------------

Hey All,

I have a question about this - is there any reason this can't exist as a user library instead of being merged into Spark itself? For utility libraries like this, I could see ones coming for Cassandra, Mongo, etc... I don't see it scaling to put and maintain all of these in the Spark code base. At the same time, however, they are super useful.

As an alternative - what about if it was in HBase, similar to e.g. the Hadoop InputFormat implementation?

> Add common solution for sending upsert actions to HBase (put, deletes, and increment)
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-2447
>                 URL: https://issues.apache.org/jira/browse/SPARK-2447
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core, Streaming
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>
> Going to review the design with Tdas today.
> But first thought is to have an extension of VoidFunction that handles the
> connection to HBase and allows for options such as turning auto flush off for
> higher throughput.
> Need to answer the following questions first.
> - Can it be written in Java or should it be written in Scala?
> - What is the best way to add the HBase dependency? (will review how Flume
> does this as the first option)
> - What is the best way to do testing? (will review how Flume does this as the
> first option)
> - How to support Python? (Python may be a different Jira; it is unknown at
> this time)
> Goals:
> - Simple to use
> - Stable
> - Supports high load
> - Documented (May be in a separate Jira; need to ask Tdas)
> - Supports Java, Scala, and hopefully Python
> - Supports Streaming and normal Spark

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
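
Editor's illustration (not part of the original message): the issue description proposes a function object that owns the HBase connection and can turn auto flush off for higher throughput. The sketch below shows only that buffering idea, with no Spark or HBase dependency. Every name in it (`BufferedMutationWriter`, `write_partition`, `send_batch`, the `Mutation` tuple) is hypothetical and is not taken from the SPARK-2447 patch or the HBase client. In a real job, `send_batch` would presumably wrap a client call, and `write_partition` is the sort of callable one might hand to an RDD's `foreachPartition`.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical mutation record: (operation, row key, value). Not an HBase type.
Mutation = Tuple[str, str, str]


class BufferedMutationWriter:
    """Collects mutations and passes them to `send_batch` in groups.

    This imitates a connection with auto flush turned off: nothing is sent
    until the buffer fills up or flush() is called explicitly.
    """

    def __init__(self, send_batch: Callable[[List[Mutation]], None],
                 buffer_size: int = 100) -> None:
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self._send_batch = send_batch
        self._buffer_size = buffer_size
        self._buffer: List[Mutation] = []

    def write(self, mutation: Mutation) -> None:
        """Buffer one mutation, flushing when the buffer is full."""
        self._buffer.append(mutation)
        if len(self._buffer) >= self._buffer_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch, if anything."""
        if self._buffer:
            self._send_batch(self._buffer)
            self._buffer = []


def write_partition(mutations: Iterable[Mutation],
                    send_batch: Callable[[List[Mutation]], None],
                    buffer_size: int = 100) -> None:
    """Write all mutations of one partition through a single buffered writer."""
    writer = BufferedMutationWriter(send_batch, buffer_size)
    for mutation in mutations:
        writer.write(mutation)
    writer.flush()  # send the final, possibly partial, batch


if __name__ == "__main__":
    sent: List[List[Mutation]] = []  # stands in for the remote table
    puts = [("put", f"row{i}", str(i)) for i in range(5)]
    write_partition(puts, sent.append, buffer_size=2)
    print([len(batch) for batch in sent])  # batch sizes: [2, 2, 1]
```

One writer per partition means one connection per partition rather than one per record, and a larger `buffer_size` trades memory and delivery latency for fewer round trips.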