Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes

2018-09-16 Thread la...@apache.org
 I think we can start by implementing a tighter integration with Spark through 
DataSource V2. That would quickly make apparent which parts of Phoenix would 
need direct access.
Some parts just need an interface audience declaration (like Phoenix's basic 
type system) and our agreement that we will change those only according to 
semantic versioning. Others (like the query plan) will need a bit more 
thinking. Maybe that's the path to hook in Calcite - just making that part up 
as I write this...
Perhaps turning the HBase interface into an API might not be so difficult 
either. That would be a new - strictly additional - client API.

A good Spark interface is in everybody's interest and I think is the best 
avenue to figure out what's missing/needed.
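
To make that concrete, here is a rough sketch against the Spark 2.3 DataSource 
V2 interfaces. PhoenixQueryPlanner, PhoenixDataReaderFactory, and the option 
names are made-up placeholders for whatever stable API Phoenix would expose; 
the point is only that the schema would come from the type system and the 
partitions from the query plan:

  import java.util.List;
  import java.util.stream.Collectors;

  import org.apache.spark.sql.Row;
  import org.apache.spark.sql.sources.v2.DataSourceOptions;
  import org.apache.spark.sql.sources.v2.DataSourceV2;
  import org.apache.spark.sql.sources.v2.ReadSupport;
  import org.apache.spark.sql.sources.v2.reader.DataReaderFactory;
  import org.apache.spark.sql.sources.v2.reader.DataSourceReader;
  import org.apache.spark.sql.types.StructType;

  public class PhoenixSparkSource implements DataSourceV2, ReadSupport {

    @Override
    public DataSourceReader createReader(DataSourceOptions options) {
      String table = options.get("table").orElseThrow(
          () -> new IllegalArgumentException("'table' is required"));
      return new DataSourceReader() {
        @Override
        public StructType readSchema() {
          // From the type system: a PDataType -> Spark type mapping.
          return PhoenixQueryPlanner.sparkSchemaFor(table);       // hypothetical
        }

        @Override
        public List<DataReaderFactory<Row>> createDataReaderFactories() {
          // From the query plan: one Spark partition per Phoenix scan range,
          // so Spark's parallelism mirrors Phoenix's own.
          return PhoenixQueryPlanner.scanRangesFor(table).stream() // hypothetical
              .map(PhoenixDataReaderFactory::new)                  // hypothetical
              .collect(Collectors.toList());
        }
      };
    }
  }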
-- Lars

On Wednesday, September 12, 2018, 12:47:21 PM PDT, Josh Elser 
 wrote:  
 
 I like it, Lars. I like it very much.

Just the easy part of doing it... ;)

On 9/11/18 4:53 PM, la...@apache.org wrote:
>  Sorry for coming a bit late to this. I've been thinking along some of these 
> lines for a bit.
> It seems Phoenix serves 4 distinct purposes:
> 1. Query parsing and compiling.
> 2. A type system.
> 3. Query execution.
> 4. An efficient HBase interface.
> Each of these is useful by itself, but we do not expose these as stable 
> interfaces. We have seen a lot of need to tie HBase into "higher level" 
> services, such as Spark (and Presto, etc).
> I think we can get a long way if we separate at least #1 (SQL) from the rest: 
> #2, #3, and #4 (a Typed HBase Interface - THI).
> Phoenix would be used via SQL (#1), while other tools such as Presto, Impala, 
> Drill, Spark, etc. could interface efficiently with HBase via THI (#2, #3, 
> and #4).
> Thoughts?
> -- Lars
>      On Monday, August 27, 2018, 11:03:33 AM PDT, Josh Elser 
> wrote:
>  
>  (bcc: dev@hbase, in case folks there have been waiting for me to send
> this email to dev@phoenix)
> 
> Hi,
> 
> In case you missed it, there was an HBaseCon event held in Asia
> recently. Stack took some great notes and shared them with the HBase
> community. A few of them touched on Phoenix, directly or in a related
> manner. I think they are good "criticisms" that are beneficial for us to
> hear.
> 
> 1. The phoenix-$version-client.jar size is prohibitively large
> 
> In this day and age, I'm surprised that this is a big issue for people.
> I know we have a lot of cruft, most of which comes from Hadoop. We have
> gotten better here over recent releases, but I would guess that there is
> more we can do.
> 
> 2. Can Phoenix be the de-facto schema for SQL on HBase?
> 
> We've long asserted "if you have to ask how Phoenix serializes data, you
> shouldn't be doing it" (a nod to the fact that you'd otherwise have to write
> lots of code). What if we turn that on its head? Could we extract our
> PDataType serialization, composite row-key, column encoding, etc. into a
> minimal API that folks with their own itches can use?
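> 
> To gauge how small that surface could be: the serialization piece already
> round-trips today if you're willing to depend on phoenix-core. A minimal
> sketch using the existing PDataType singletons (table and column context
> omitted):
> 
>   import org.apache.phoenix.schema.types.PInteger;
>   import org.apache.phoenix.schema.types.PVarchar;
> 
>   public class PhoenixEncodingDemo {
>     public static void main(String[] args) {
>       // An external tool writing/reading HBase cell values in Phoenix's
>       // encoding, without going through SQL at all.
>       byte[] idBytes   = PInteger.INSTANCE.toBytes(42);
>       byte[] nameBytes = PVarchar.INSTANCE.toBytes("phoenix");
> 
>       int id      = (Integer) PInteger.INSTANCE.toObject(idBytes);
>       String name = (String)  PVarchar.INSTANCE.toObject(nameBytes);
>       System.out.println(id + " " + name);
>     }
>   }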
> 
> With the growing integrations into Phoenix, we could embrace them by
> providing an API to make what they're doing easier. In the same vein, we
> cement ourselves as a cornerstone of doing it "correctly".
> 
> 3. Better recommendations to users to not attempt certain queries.
> 
> We definitively know that there are certain types of queries that
> Phoenix cannot support well (compared to optimal Phoenix use-cases).
> Users very commonly fall into such pitfalls on their own and this leaves
> a bad taste in their mouth (thinking that the product "stinks").
> 
> Can we do a better job of telling the user when and why it happened?
> What would such a user-interaction model look like? Can we supplement
> the "why" with instructions of what to do differently (even if in the
> abstract)?
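> 
> One cheap starting point (a sketch, not a design): clients and integrations
> can already run EXPLAIN over JDBC and inspect the plan text before
> executing. Matching on the "FULL SCAN" phrase is an assumption about today's
> plan output:
> 
>   import java.sql.Connection;
>   import java.sql.DriverManager;
>   import java.sql.ResultSet;
>   import java.sql.Statement;
> 
>   public class PlanChecker {
>     // Returns true if Phoenix's plan for `sql` appears to contain a full
>     // table scan, so the caller can warn the user before running the query.
>     static boolean looksLikeFullScan(String url, String sql) throws Exception {
>       try (Connection conn = DriverManager.getConnection(url);
>            Statement stmt = conn.createStatement();
>            ResultSet rs = stmt.executeQuery("EXPLAIN " + sql)) {
>         while (rs.next()) {
>           if (rs.getString(1).contains("FULL SCAN")) {
>             return true;
>           }
>         }
>       }
>       return false;
>     }
>   }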
> 
> 4. Phoenix-Calcite
> 
> This was mentioned as a "nice to have". From what I understand, there
> was nothing explicitly wrong with the implementation or approach, just
> that it was a massive undertaking to continue with little immediate
> gain. Would this be a boon for us to try to continue in some form? Are
> there steps we can take that would help push us along the right path?
> 
> Anyways, I'd love to hear everyone's thoughts. While the concerns were
> raised at HBaseCon Asia, the suggestions that accompany them here are
> largely mine ;). Feel free to break them out into their own threads if
> you think that would be better (or say that you disagree with me --
> that's cool too)!
> 
> - Josh

Re: Salting based on partial rowkeys

2018-09-16 Thread la...@apache.org
 I added some comments on PHOENIX-4757.

On Thursday, September 13, 2018, 6:42:12 PM PDT, Josh Elser 
 wrote:  
 
 Ahh, I get you now.

For a composite primary key made up of columns 1 through N, you want 
similar controls to compute the value of the salt based on a sequence of 
the columns 1 through M where M <= N (instead of always on all columns).

For large numbers of salt buckets and a scan over a facet, you prune 
your search space considerably. Makes sense to me!
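
A minimal sketch of the mechanics, assuming Phoenix's existing 
SaltingUtil.getSaltingByte helper (the offset/length locating id_2 inside the 
composite key are illustrative stand-ins):

  import org.apache.phoenix.schema.SaltingUtil;

  public class PartialSaltSketch {
    static byte saltByte(byte[] rowKey, int id2Offset, int id2Length, int buckets) {
      // Today Phoenix hashes the whole key:
      //   SaltingUtil.getSaltingByte(rowKey, 0, rowKey.length, buckets)
      // Hashing only the id_2 slice instead keeps every row with the same
      // id_2 in the same bucket, so aggregates on id_2 stay local to a
      // regionserver.
      return SaltingUtil.getSaltingByte(rowKey, id2Offset, id2Length, buckets);
    }
  }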

On 9/13/18 6:37 PM, Gerald Sangudi wrote:
> In case the text formatting is lost below, I also added it as a comment in
> the JIRA ticket:
> 
> https://issues.apache.org/jira/browse/PHOENIX-4757
> 
> 
> On Thu, Sep 13, 2018 at 3:24 PM, Gerald Sangudi 
> wrote:
> 
>> Sorry I missed Josh's reply; I've subscribed to the dev list now.
>>
>> Below is a copy-and-paste from our internal document. Thanks in advance
>> for your review and additional feedback on this.
>>
>> Gerald
>>
>> Background:
>> We make extensive use of multi-column rowkeys and salting in our different
>> Apache Phoenix deployments. We frequently perform group-by aggregations on
>> these data along a specific dimension that would benefit from predictably
>> partitioning the data along that dimension.
>>
>> Proposal:
>> We propose to add table metadata to allow schema designers to constrain
>> salting to a subset of the rowkey, rather than the full rowkey as it is
>> today. This will introduce a mechanism to partition data on a per-table
>> basis along a single dimension without application changes or much change
>> to the Phoenix runtime logic. We expect this will result in substantially
>> faster group-by's along the salted dimension and negligible penalties
>> elsewhere. This feature has also been proposed in PHOENIX-4757, where it
>> was pointed out that partitioning and sorting data along different
>> dimensions is a common pattern in other datastores as well. Theoretically,
>> it could cause hotspotting when querying along the salted dimension
>> without the leading rowkey - that would be an anti-pattern.
>>
>> Usage Example
>>
>> Current:
>>
>> Schema:
>> CREATE TABLE relationship (
>>   id_1 BIGINT NOT NULL,
>>   id_2 BIGINT NOT NULL,
>>   other_key BIGINT NOT NULL,
>>   val SMALLINT,
>>   CONSTRAINT pk PRIMARY KEY (id_1, id_2, other_key))
>> SALT_BUCKETS=60;
>>
>> Query:
>> Select id_2, sum(val)
>> From relationship
>> Where id_1 in (2,3)
>> Group by id_2
>>
>> Explain:
>> 0: jdbc:phoenix:> EXPLAIN Select id_2, sum(val) From relationship Where
>> id_1 in (2,3) Group by id_2;
>> +-----------------------------------------------------------------------------------------+--------+
>> | PLAN                                                                                    | EST_BY |
>> +-----------------------------------------------------------------------------------------+--------+
>> | CLIENT 60-CHUNK PARALLEL 60-WAY SKIP SCAN ON 120 KEYS OVER RELATIONSHIP [0,2] - [59,3]  | null   |
>> |     SERVER AGGREGATE INTO DISTINCT ROWS BY [ID_2]                                       | null   |
>> | CLIENT MERGE SORT                                                                       | null   |
>> +-----------------------------------------------------------------------------------------+--------+
>> 3 rows selected (0.048 seconds)
>>
>> In this case, although the group by is performed on both the client and
>> regionserver, almost all of the actual grouping happens on the client
>> because the id_2's are randomly distributed across the regionservers. As a
>> result, a lot of unnecessary data is serialized to the client and grouped
>> serially there. This can become quite material with large resultsets.
>>
>> Proposed:
>>
>> Schema:
>> CREATE TABLE relationship (
>>   id_1 BIGINT NOT NULL,
>>   id_2 BIGINT NOT NULL,
>>   other_key BIGINT NOT NULL,
>>   val SMALLINT,
>>   CONSTRAINT pk PRIMARY KEY (id_1, id_2, other_key))
>> SALT_BUCKETS=60, SALT_COLUMN = id_2;
>>
>> Query (unchanged):
>> Select id_2, sum(val)
>> From relationship
>> Where id_1 in (2,3)
>> Group by id_2
>>
>> Explain (unchanged)
>>
>> Under the proposal, the data are merely partitioned so that all rows
>> containing the same id_2 are on the same regionserver; the above query
>> will perform almost all of the grouping in parallel on the regionservers.
>> No special hint or changes to the query plan would be required to benefit.
>> Tables would need to be re-salted to take advantage of the new
>> functionality.
>>
>> Technical changes proposed to Phoenix:
>> - Create a new piece of table-level metadata: SALT_COLUMN. SALT_COLUMN
>>   will instruct the salting logic to generate a salt-byte based only on
>>   the specified column. If unspecified, it will behave as it does today
>>   and default to salting the entire rowkey. This metadata may be specified
>>   only when the table is created and may not be modified. The specified
>>   column must be part of the rowkey.
>> - Modify all callers of getSaltingByte