Re: Phoenix Mapreduce

2019-04-30 Thread Shawn Li
…can launch Mappers on the same node as the RegionServer hosting your Region and avoid reading any data over the network. This is just an optimization. On 4/30/19 10:12 AM, Shawn Li wrote: Hi, The number of Maps in Phoenix Mapreduce…

Phoenix Mapreduce

2019-04-30 Thread Shawn Li
Hi, The number of Maps in a Phoenix Mapreduce job is determined by the table's region count. My question is: if a region is split by another ingestion process while the Phoenix Mapreduce job is running, do we lose some data because of the split? We would then have more regions than maps, and the maps only…
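A minimal sketch of why a mid-job region split need not lose data: Phoenix's input format plans one scan per region at job-submission time, and each planned scan is a key range, not a region name. If the planned ranges cover the whole key space, a later region split changes metadata but not coverage. The region boundaries and row keys below are hypothetical; verify the actual behavior against PhoenixInputFormat in the Phoenix source.

```python
# Sketch: input splits are (start, stop) key ranges captured at submit time.
# A region split after planning does not change which keys the ranges cover.

def plan_splits(region_boundaries):
    """One (start, stop) scan range per region, computed at job submission."""
    return list(zip(region_boundaries[:-1], region_boundaries[1:]))

def rows_covered(splits, rowkeys):
    """All row keys that fall inside some planned scan range."""
    covered = set()
    for start, stop in splits:
        covered.update(k for k in rowkeys if start <= k < stop)
    return covered

# Two regions at submit time; "~" stands in for the end-of-table key.
boundaries = ["", "m", "~"]
splits = plan_splits(boundaries)          # [("", "m"), ("m", "~")]

rowkeys = {"apple", "kiwi", "melon", "pear"}
# Even if the second region later splits at "p", the planned key ranges
# still cover every row key, so no rows are skipped.
assert rows_covered(splits, rowkeys) == rowkeys
```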

How to decode composite rowkey back to original primary keys

2019-01-14 Thread Shawn Li
Hi, Phoenix encodes a composite primary key into the HBase rowkey. We want to check whether there is any documentation or example showing how to manually decode an HBase rowkey back into the original values of those primary keys. Or is there any Phoenix source code we can use directly to do this? Thanks, Shawn
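A hedged sketch of the decoding for an ascending composite key of (CHAR(2), VARCHAR, BIGINT): fixed-width CHAR occupies its declared bytes, a non-trailing VARCHAR is UTF-8 followed by a 0x00 separator, and a BIGINT is 8 bytes big-endian with the sign bit flipped so negatives sort first. This mirrors Phoenix's documented ASC encoding; DESC columns invert the bytes, so verify against the PDataType classes in the Phoenix source before relying on it. The column names are hypothetical.

```python
import struct

def encode_key(state: str, city: str, id_: int) -> bytes:
    # CHAR(2): fixed 2 bytes; VARCHAR: UTF-8 plus a 0x00 separator (needed
    # because it is not the last key column); BIGINT: 8-byte big-endian with
    # the sign bit flipped so negative values sort before positive ones.
    return (state.encode("ascii")
            + city.encode("utf-8") + b"\x00"
            + struct.pack(">Q", (id_ + (1 << 63)) % (1 << 64)))

def decode_key(rowkey: bytes):
    """Reverse the encoding above to recover the original key values."""
    state = rowkey[:2].decode("ascii")
    sep = rowkey.index(b"\x00", 2)            # VARCHAR terminator
    city = rowkey[2:sep].decode("utf-8")
    (unsigned,) = struct.unpack(">Q", rowkey[sep + 1:sep + 9])
    return state, city, unsigned - (1 << 63)  # undo the sign-bit flip

assert decode_key(encode_key("NY", "New York", 42)) == ("NY", "New York", 42)
assert decode_key(encode_key("CA", "", -7)) == ("CA", "", -7)
```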

Re: column mapping schema decoding

2019-01-02 Thread Shawn Li
Hi Jaanai and Pedro, Any input on my example? Thanks, Shawn. On Thu, Dec 27, 2018, 12:34 Shawn Li wrote: Hi Jaanai, Thanks for the input. So the encoding scheme is not a simple first-come, first-assigned (such as in my example: A.population -> 1, A.type -> 2; B.zipcode -…

Re: column mapping schema decoding

2018-12-27 Thread Shawn Li
…using the original column is better. Jaanai Zhang, Best regards! Shawn Li wrote on Thu, Dec 27, 2018 at 7:17 AM: Hi Pedro, Thanks for the reply. Can you explain a little bit more? For example, if we use COLU…

Re: column mapping schema decoding

2018-12-26 Thread Shawn Li
…On Wed, 26 Dec 2018, 23:00 Shawn Li wrote: Hi, Phoenix 4.10 introduced the column mapping feature. There are four types of mapping schemes (https://phoenix.apache.org/columnencoding.html). Is there any documentation that shows how to encode/…

column mapping schema decoding

2018-12-26 Thread Shawn Li
Hi, Phoenix 4.10 introduced the column mapping feature. There are four types of mapping schemes (https://phoenix.apache.org/columnencoding.html). Is there any documentation that shows how to encode/map a string column name in Phoenix to the numeric column qualifier in HBase? We are using the Lily HBase Indexer…
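A rough sketch of going from an encoded qualifier back to a column name, under two assumptions that should be checked against QualifierEncodingScheme in the Phoenix source: that a TWO_BYTE_QUALIFIERS cell qualifier is a 2-byte big-endian integer, and that the integer-to-name mapping (which Phoenix persists in SYSTEM.CATALOG, with user columns numbered from 11) can be modeled here as a plain dictionary. The catalog contents below are hypothetical.

```python
import struct

# Hypothetical stand-in for the name <-> number mapping that Phoenix stores
# in SYSTEM.CATALOG; qualifier numbers for user columns start at 11.
catalog = {11: "POPULATION", 12: "TYPE", 13: "ZIPCODE"}

def decode_two_byte_qualifier(qual: bytes) -> str:
    """Map a raw HBase cell qualifier back to the Phoenix column name."""
    (number,) = struct.unpack(">H", qual)   # 2-byte big-endian int (assumed)
    return catalog[number]

assert decode_two_byte_qualifier(b"\x00\x0b") == "POPULATION"
```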

Re: "upsert select" with "limit" clause

2018-12-20 Thread Shawn Li
…Vincent. On Tue, Dec 18, 2018 at 2:31 PM Vincent Poon wrote: Shawn, that sounds like a bug, I would file a JIRA. On Tue, Dec 18, 2018 at 12:33 PM Shawn Li wrote: Hi Vincent & William, …

Re: "upsert select" with "limit" clause

2018-12-18 Thread Shawn Li
…be substantially slower than the other. Vincent. On Mon, Dec 17, 2018 at 9:14 PM Shawn Li wrote: Hi Jonathan, The single-threaded-on-one-side explanation sounds logical to me. Hopefully Vincent can confirm it. Thanks, …

Re: "upsert select" with "limit" clause

2018-12-17 Thread Shawn Li
…upsert. On Dec 17, 2018, at 6:43 PM, Shawn Li wrote: Hi Vincent, Thanks for explaining. That makes much more sense now, and it explains the high memory usage without the "limit" clause, because it upserts much more quickly when using "upsert select"…

Re: "upsert select" with "limit" clause

2018-12-17 Thread Shawn Li
…the reason there are all these hurdles is that it's generally not recommended to do a server-side upsert select across different tables, because that means you're doing cross-regionserver RPCs (e.g. read data from a region of the source table, and write to a region of the target table on a…

Re: "upsert select" with "limit" clause

2018-12-16 Thread Shawn Li
…Jaanai Zhang, Best regards! Shawn Li wrote on Thu, Dec 13, 2018 at 12:10 PM: Hi Jaanai, Thanks for sharing your thoughts. The behavior you describe is correct on the HBase region server side. The memory usage for the block cache…

Re: "upsert select" with "limit" clause

2018-12-12 Thread Shawn Li
…limit, which will read the source table and write the target tables on the server side. I think the higher memory usage is caused by use of the scan cache and memstore under the higher throughput. Jaanai Zhang, Best regards! …

Re: "upsert select" with "limit" clause

2018-12-12 Thread Shawn Li
…if you have more than one regionserver. So instead, results are sent back to the client, where the LIMIT is applied and then written back to the server in the UPSERT. On Wed, Dec 12, 2018 at 1:18 PM Shawn Li wrote: Hi Vincent, …
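The point above can be illustrated with a small simulation: if each regionserver applied the LIMIT independently, a multi-region table would return up to limit × num_regions rows, which is why the limit must be applied in one place, on the client. The region contents below are made up for illustration.

```python
# Sketch: why LIMIT cannot simply be pushed down to each regionserver.
regions = [["r1", "r2", "r3"], ["r4", "r5"], ["r6", "r7", "r8"]]
LIMIT = 4

# Naive server-side push-down: each region applies the limit on its own,
# so the union can hold far more than LIMIT rows.
per_region = [row for region in regions for row in region[:LIMIT]]
assert len(per_region) == 8          # 3 + 2 + 3 rows: too many

# Client-side application (what the thread describes for UPSERT SELECT
# with LIMIT): stream all results to the client, truncate once, then
# write the truncated set back to the server.
client_side = [row for region in regions for row in region][:LIMIT]
assert len(client_side) == LIMIT
```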

Re: "upsert select" with "limit" clause

2018-12-12 Thread Shawn Li
Hi Vincent, The table creation statement is similar to the one below. We have about 200 fields. The table is mutable and has no indexes. CREATE TABLE IF NOT EXISTS us_population ( state CHAR(2) NOT NULL, city VARCHAR, population BIGINT, … CONSTRAINT…