I think we can start by implementing a tighter integration with Spark through 
DataSource V2. That would quickly make it apparent which parts of Phoenix would 
need direct access.
Some parts just need an interface audience declaration (like Phoenix's basic 
type system) and our agreement that we will change those only according to 
semantic versioning. Others (like the query plan) will need a bit more 
thinking. Maybe that's the path to hook in Calcite - just making that part up as I 
write this...
Perhaps turning the HBase interface into an API might not be so difficult 
either. That would perhaps be a new, strictly additional, client API.

A good Spark interface is in everybody's interest, and I think it is the best 
avenue to figure out what's missing/needed.
-- Lars

    On Wednesday, September 12, 2018, 12:47:21 PM PDT, Josh Elser 
<els...@apache.org> wrote:  
 
 I like it, Lars. I like it very much.

Just the easy part of doing it... ;)

On 9/11/18 4:53 PM, la...@apache.org wrote:
>  Sorry for coming a bit late to this. I've been thinking along some of these 
> lines for a bit.
> It seems Phoenix serves 4 distinct purposes:
> 1. Query parsing and compiling
> 2. A type system
> 3. Query execution
> 4. Efficient HBase interface
> Each of these is useful by itself, but we do not expose these as stable 
> interfaces. We have seen a lot of need to tie HBase into "higher level" 
> services, such as Spark (and Presto, etc).
> I think we can get a long way if we separate at least #1 (SQL) from the rest: 
> #2, #3, and #4 (Typed HBase Interface - THI).
> Phoenix is used via SQL (#1); other tools such as Presto, Impala, Drill, 
> Spark, etc, can interface efficiently with HBase via THI (#2, #3, and #4).
> Thoughts?
> -- Lars
>      On Monday, August 27, 2018, 11:03:33 AM PDT, Josh Elser 
><els...@apache.org> wrote:
>  
>  (bcc: dev@hbase, in case folks there have been waiting for me to send
> this email to dev@phoenix)
> 
> Hi,
> 
> In case you missed it, there was an HBaseCon event held in Asia
> recently. Stack took some great notes and shared them with the HBase
> community. A few of them touched on Phoenix, directly or in a related
> manner. I think they are good "criticisms" that are beneficial for us to
> hear.
> 
> 1. The phoenix-$version-client.jar size is prohibitively large
> 
> In this day and age, I'm surprised that this is a big issue for people.
> I know we have a lot of cruft, most of which comes from Hadoop. We have
> gotten better here over recent releases, but I would guess that there is
> more we can do.
> 
> 2. Can Phoenix be the de-facto schema for SQL on HBase?
> 
> We've long asserted "if you have to ask how Phoenix serializes data, you
> shouldn't be doing it" (a nod that you have to write lots of code). What if
> we turn that on its head? Could we extract our PDataType serialization,
> composite row-key, column encoding, etc into a minimal API that folks
> with their own itches can use?
> 
> With the growing integrations into Phoenix, we could embrace them by
> providing an API to make what they're doing easier. In the same vein, we
> cement ourselves as a cornerstone of doing it "correctly".
> 
> 3. Better recommendations to users to not attempt certain queries.
> 
> We definitively know that there are certain types of queries that
> Phoenix cannot support well (compared to optimal Phoenix use-cases).
> Users very commonly fall into such pitfalls on their own and this leaves
> a bad taste in their mouth (thinking that the product "stinks").
> 
> Can we do a better job of telling the user when and why it happened?
> What would such a user-interaction model look like? Can we supplement
> the "why" with instructions of what to do differently (even if in the
> abstract)?
> 
> 4. Phoenix-Calcite
> 
> This was mentioned as a "nice to have". From what I understand, there
> was nothing explicitly wrong with the implementation or approach, just
> that it was a massive undertaking to continue with little immediate
> gain. Would this be a boon for us to try to continue in some form? Are
> there steps we can take that would help push us along the right path?
> 
> Anyways, I'd love to hear everyone's thoughts. While the concerns were
> raised at HBaseCon Asia, the suggestions that accompany them here are
> largely mine ;). Feel free to break them out into their own threads if
> you think that would be better (or say that you disagree with me --
> that's cool too)!
> 
> - Josh
>    
> 
  
