Re: Is 200k data in a column a big concern?

2019-06-24 Thread Jaanai Zhang
Could you please show your SQL? Which kind of requests are you referring to?


   Jaanai Zhang
   Best regards!



jesse  于2019年6月21日周五 上午9:37写道:

> It seems the write take a long time and the system substantially slows
> down with requests.
>
> however, hbase official doc mentions soft limit is 32mb.
>


Re: is Apache phoenix reliable enough?

2019-06-24 Thread Jaanai Zhang
To be honest, the stability of Phoenix is a big problem. The following
behaviors are common:
1.  Hanging on the server side, queries slowing down, and frequent OOMs.
These troubles are often caused by full scans, which we can usually avoid
with secondary indexes (see the sketch after this list). It is important to
make sure the filters in the WHERE clause will not scan large ranges of the
data table.
2.  Index data getting out of sync. I think there has been substantial
improvement here; several issues around this problem have been fixed in
recent versions.
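
Regarding point 1: a covered index usually turns such a full scan into a
range scan over the index table. A rough JDBC sketch (the table, column and
ZooKeeper quorum names below are made up, adjust them for your cluster):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CoveredIndexCheck {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            // Covered index: CUSTOMER_ID is the index key and AMOUNT is INCLUDEd,
            // so the query below can be answered entirely from the index table.
            stmt.execute("CREATE INDEX IF NOT EXISTS IDX_ORDERS_CUSTOMER "
                       + "ON ORDERS (CUSTOMER_ID) INCLUDE (AMOUNT)");
            // The plan should now show a RANGE SCAN OVER IDX_ORDERS_CUSTOMER
            // instead of a FULL SCAN OVER ORDERS.
            try (ResultSet rs = stmt.executeQuery(
                    "EXPLAIN SELECT CUSTOMER_ID, AMOUNT FROM ORDERS WHERE CUSTOMER_ID = 'c-001'")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}

If EXPLAIN still reports a full scan over the data table, the WHERE clause
is not selective on the index key or the index does not cover every
selected column.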

Think about whether Phoenix is a good fit for your business scenarios.
Phoenix has a lot of functionality, but some of the analytics features are
inefficient on massive data sets.

We have almost 200 clusters running Phoenix, and they have satisfied many
business requirements.


   Jaanai Zhang
   Best regards!



Flavio Pompermaier  于2019年6月24日周一 下午5:30写道:

> I also faced many stability problems with Phoenix..it's very complicated
> to tune the tables in order to have decent performance for all kind of
> queries.
> Since we need to be performant for every type of query (analytics and
> exploration) we use Elasticsearch + Join plugin (i.e. Siren platform [1])
>
> Best,
> Flavio
>
> [1] https://siren.io/welcome-siren-platform/
>
> On Mon, Jun 24, 2019 at 8:02 AM Hengesbach, Martin <
> martin.hengesb...@fiz-karlsruhe.de> wrote:
>
>> Hi,
>>
>>
>>
>> we are using Phoenix in production since more than 2 years. We are quite
>> unhappy with the reliability of Phoenix. We migrated from Oracle because of
>> the performance (we have tables with up to 200M records, each record up to
>> 30 MB in size, up to 10 selective columns). Phoenix is really much faster
>> than Oracle (20 nodes cluster).
>>
>>
>>
>> But we have problems with
>>
>> · Phoenix totally hanging, only restart helps (We were able to
>> reduce this from daily to monthly)
>>
>> · incorrect indices, need rebuild
>>
>> · select statements not producing the expected (specified)
>> results
>>
>> · Upserts sporadically not working without error message
>>
>> · Some not reproducible errors
>>
>> · …
>>
>>
>>
>> We are thinking about switching to another database, but the question is:
>> what is better (reliable and performant)?
>>
>>
>>
>> Conclusion: With the current status of Phoenix, I would never use it
>> again.
>>
>>
>>
>> Regards
>>
>> Martin
>>
>>
>>
>>
>>
>>
>>
>> *Von:* jesse [mailto:chat2je...@gmail.com]
>> *Gesendet:* Samstag, 22. Juni 2019 20:04
>> *An:* user@phoenix.apache.org
>> *Betreff:* is Apache phoenix reliable enough?
>>
>>
>>
>> I stumbled on this post:
>>
>>
>> https://medium.com/@vkrava4/fighting-with-apache-phoenix-secondary-indexes-163495bcb361
>>
>>
>> and the bug:
>>
>>https://issues.apache.org/jira/browse/PHOENIX-5287
>>
>>
>>  I had a similar very frustrating experience with Phoenix, In addition to
>> various performance issues, you can found one of my posts about the
>> reliability issue on the mail-list.
>>
>>
>>
>>
>> https://lists.apache.org/thread.html/231b175fce8811d474cceb1fe270a3dd6b30c9eff2150ac42ddef0dc@%3Cuser.phoenix.apache.org%3E
>>
>>
>>
>>  just wondering others experience if you could share
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> FIZ Karlsruhe - Leibniz-Institut für Informationsinfrastruktur GmbH.
>> Sitz der Gesellschaft: Eggenstein-Leopoldshafen, Amtsgericht Mannheim HRB
>> 101892.
>> Geschäftsführerin: Sabine Brünger-Weilandt.
>> Vorsitzende des Aufsichtsrats: MinDirig'in Dr. Angelika Willms-Herget.
>> FIZ Karlsruhe ist zertifiziert mit dem Siegel "audit berufundfamilie".
>>
>
>


Re: COALESCE Function Not Working With NULL Values

2019-05-14 Thread Jaanai Zhang
Hi, Jestan

Currently, Phoenix 5.0.0 is not compatible with HBase 2.0.5; see
https://issues.apache.org/jira/browse/PHOENIX-5268


   Jaanai Zhang
   Best regards!



Jestan Nirojan  于2019年5月15日周三 上午5:04写道:

> Hi William,
>
> Thanks, It is working with coalesce(functionThatMightReturnNull(), now())
> without an explicit null;
>
> Phoenix Version is 5.0.0.0 which uses HBase 2.0.5
> I have not opened any issue for this, I am not sure how it is suppose to
> work.
>
> I am developing  a phoenix driver for metabase <https://metabase.com/> (which
> is a BI/DataViz tool).
> It seems for optional query parameter, null values are directly set by the
> base metabase driver which I am trying to extend.
>
> I wish if phoenix can support explicit null values.
>
> thanks and regards,
> -Jestan
>
>
> On Tue, May 14, 2019 at 11:52 PM William Shen 
> wrote:
>
>> Just took a look at the implementation, seems like Phoenix relies on the
>> first expression to not be an expression that is not just an explicit
>> "null" because it needs to evaluate for data type coercion. What's the use
>> case for specifying an explicit null?
>>
>> On the other hand, the following should work:
>> select coalesce(functionThatMightReturnNull(), now()) as date;
>>
>> On Tue, May 14, 2019 at 11:14 AM William Shen 
>> wrote:
>>
>>> Jestan,
>>> It seems like a bug to me. What version of Phoenix are you using, and
>>> did you create a ticket already?
>>>
>>> On Tue, May 14, 2019 at 10:26 AM Jestan Nirojan 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to use COALESCE function to handle default value in WHERE
>>>> condition like below.
>>>>
>>>> select  * from table1 where created_date >= coalesce(null, trunc(now(),
>>>> 'day'));
>>>>
>>>> But it throws NullPointerException
>>>>
>>>> Caused by: java.lang.NullPointerException
>>>> at
>>>> org.apache.phoenix.schema.types.PDataType.equalsAny(PDataType.java:326)
>>>> at org.apache.phoenix.schema.types.PDate.isCoercibleTo(PDate.java:111)
>>>> at
>>>> org.apache.phoenix.expression.function.CoalesceFunction.(CoalesceFunction.java:68)
>>>> ... 47 more
>>>>
>>>> I was able to reproduce the same error with following query
>>>>
>>>> select coalesce(null, now()) as date;
>>>>
>>>> Here are some other variant of same issue
>>>>
>>>> 1. select coalesce(now(), now()) as date; // returns 2019-05-14
>>>> 2. select coalesce(now(), null) as date; // returns empty
>>>> 3. select coalesce(null, now()) as date; // throws exception
>>>>
>>>> I have tried the same for INT and VARCHAR, same outcome
>>>> Am I doing something wrong here or is coalesce suppose to return a non
>>>> null value ?
>>>>
>>>> thanks and regards,
>>>> -Jestan Nirojan
>>>>
>>>


Re: Query logging - PHOENIX-2715

2019-04-23 Thread Jaanai Zhang
The query log has four levels: OFF, INFO, DEBUG, and TRACE, and the default
is OFF. You can log each query on the client side by setting
"phoenix.log.level" accordingly.
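
For example, a minimal client-side sketch. The JDBC URL is a placeholder,
the property is assumed to be picked up from the connection Properties (it
can also go in the client's hbase-site.xml), and the SYSTEM.LOG column list
used here is an assumption on my side:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class QueryLogExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Client-side setting; it can also be placed in the client's hbase-site.xml.
        props.setProperty("phoenix.log.level", "DEBUG");
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props);
             Statement stmt = conn.createStatement()) {
            // Any SELECT issued through this connection should be recorded in SYSTEM.LOG.
            stmt.executeQuery("SELECT 1 FROM SYSTEM.CATALOG LIMIT 1").close();
            // Log writes are asynchronous, so entries may appear after a short delay.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT START_TIME, QUERY FROM SYSTEM.LOG ORDER BY START_TIME DESC LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "  " + rs.getString(2));
                }
            }
        }
    }
}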

----
   Jaanai Zhang
   Best regards!



M. Aaron Bossert  于2019年4月20日周六 上午8:12写道:

> Sorry if I have missed something obvious, but I saw that this was
> implemented (according to JIRA) in 4.14 and 5.0.0.  I need to set this up
> to log each query in that SYSTEM:LOG table, but cannot seem to find the
> official instructions for how to configure this through LOG4J settings or
> whatever it is now.  Could someone please point me in the right direction?
>
> On Tue, Nov 20, 2018 at 9:31 PM Jaanai Zhang 
> wrote:
>
>> yep!  That configuration options do not exist.
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> Curtis Howard  于2018年11月20日周二 下午11:42写道:
>>
>>> Hi Jaanai,
>>>
>>> Thanks for your suggestion.  Just confirming then - it sounds like this
>>> would involve adding custom code, as there are no current configuration
>>> options to enable logging/capture of all query statements (including
>>> DDL/DML) from clients?
>>>
>>> Thanks for your help
>>> Curtis
>>>
>>> On Tue, Nov 20, 2018 at 8:33 AM Jaanai Zhang 
>>> wrote:
>>>
>>>> We can't capture some detail information about the DDL/DML operations
>>>> by TRACE log. I suggest that you can print logs of these operations on the
>>>> logic layer.
>>>>
>>>> 
>>>>Jaanai Zhang
>>>>Best regards!
>>>>
>>>>
>>>>
>>>> Curtis Howard  于2018年11月20日周二 上午11:20写道:
>>>>
>>>>> Hi,
>>>>>
>>>>> Is the expected behavior for this new feature to capture all
>>>>> operations (UPSERT / DROP / CREATE / ...)?  After enabling
>>>>> phoenix.log.level=TRACE, I see only SELECT queries populated in the
>>>>> SYSTEM.LOG table.
>>>>>
>>>>> Thanks!
>>>>> Curtis
>>>>>
>>>>>


Re: Arithmetic Error in a select query

2019-02-13 Thread Jaanai Zhang
Yes, that is the region server logs. The tracing feature you mentioned lets
you collect important metrics along the path of a query or insert, please
see https://phoenix.apache.org/tracing.html. The region server logs only
provide the RS's own logging information (i.e. just the server side).


   Jaanai Zhang
   Best regards!



talluri abhishek  于2019年2月14日周四 上午11:23写道:

> Thanks, Jaanai.
>
> For the second question, did you mean the region server logs?
>
> Also, I see that Phoenix has tracing features that we can enable. Could we
> enable it to get more information or what is it that tracing provides that
> region server logs cannot?
>
> -AT
>
> On Thu, Feb 14, 2019 at 8:37 AM Jaanai Zhang 
> wrote:
>
>> 1. How do we capture the debug level logs on this jdbc client? Should we
>>> also enable debug level on the region servers to understand what is
>>> triggering this error?
>>>
>>
>> You can set the logging level in the log4j configuration file on the
>> client side, most of the time we can disable the debug level.
>>
>> 2. Primary key on that table had a not null constraint and not sure why
>>> the error was stating null?
>>>
>>
>> This is an error of the server side, perhaps you can find some exceptions
>> from the log files.
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> talluri abhishek  于2019年2月13日周三 上午11:06写道:
>>
>>> Hi All,
>>>
>>> I have a `select primary_key from table_name limit 50` query that
>>> works most of the time but it returns the below error at times.
>>> ERROR 212 (22012): Arithmetic error on server. ERROR 212 (22012):
>>> Arithmetic error on server. null
>>> We are using a jdbc client to query phoenix and have the following
>>> questions in mind
>>>
>>> 1. How do we capture the debug level logs on this jdbc client? Should we
>>> also enable debug level on the region servers to understand what is
>>> triggering this error?
>>> 2. Primary key on that table had a not null constraint and not sure why
>>> the error was stating null?
>>>
>>> Thanks,
>>> Abhishek
>>>
>>


Re: Arithmetic Error in a select query

2019-02-13 Thread Jaanai Zhang
>
> 1. How do we capture the debug level logs on this jdbc client? Should we
> also enable debug level on the region servers to understand what is
> triggering this error?
>

You can set the logging level in the log4j configuration file on the client
side; most of the time the DEBUG level can stay disabled.
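
If it is easier than editing log4j.properties, the level can also be raised
programmatically before opening the connection. A small sketch, assuming the
client uses log4j 1.x (as the Phoenix 4.x client does); the logger names are
just the usual package prefixes:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class ClientLogLevel {
    public static void main(String[] args) {
        // Equivalent to "log4j.logger.org.apache.phoenix=DEBUG" in log4j.properties.
        Logger.getLogger("org.apache.phoenix").setLevel(Level.DEBUG);
        Logger.getLogger("org.apache.hadoop.hbase.client").setLevel(Level.DEBUG);
        // ... open the Phoenix JDBC connection and reproduce the failing query here ...
    }
}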

2. Primary key on that table had a not null constraint and not sure why the
> error was stating null?
>

This is a server-side error; perhaps you can find some exceptions in the
region server log files.

--------
   Jaanai Zhang
   Best regards!



talluri abhishek  于2019年2月13日周三 上午11:06写道:

> Hi All,
>
> I have a `select primary_key from table_name limit 50` query that
> works most of the time but it returns the below error at times.
> ERROR 212 (22012): Arithmetic error on server. ERROR 212 (22012):
> Arithmetic error on server. null
> We are using a jdbc client to query phoenix and have the following
> questions in mind
>
> 1. How do we capture the debug level logs on this jdbc client? Should we
> also enable debug level on the region servers to understand what is
> triggering this error?
> 2. Primary key on that table had a not null constraint and not sure why
> the error was stating null?
>
> Thanks,
> Abhishek
>


Re: Phoenix JDBC Connection Warmup

2019-02-01 Thread Jaanai Zhang
>
>  we experimented with issuing the same query repeatedly, and we observed a
> slow down not only on the first query

I am not sure what the reason is; perhaps you can enable TRACE logging to
find what is slowing things down. I guess that some meta information gets
reloaded under a heavy write workload.

----
   Jaanai Zhang
   Best regards!



William Shen  于2019年2月1日周五 上午2:09写道:

> Thanks Jaanai. Do you know if that is expected only on the first query
> against a table? For us, we experimented with issuing the same query
> repeatedly, and we observed a slow down not only on the first query. Does
> it make sense to preemptively load table metadata on start up to warm up
> the system to reduce latency during the actual query time (if it is
> possible to do so)?
>
> On Wed, Jan 30, 2019 at 10:54 PM Jaanai Zhang 
> wrote:
>
>> It is expected when firstly query tables after establishing the
>> connection. Something likes loads some meta information into local cache
>> that need take some time,  mainly including two aspects: 1. access
>> SYSTEM.CATALOG table to get schema information of the table  2. access the
>> meta table of HBase to get regions information of the table
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> William Shen  于2019年1月31日周四 下午1:37写道:
>>
>>> Hi there,
>>>
>>> I have a component that makes Phoenix queries via the Phoenix JDBC
>>> Connection. I noticed that consistently, the Phoenix Client takes longer to
>>> execute a PreparedStatement and it takes longer to read through the
>>> ResultSet for a period of time (~15m) after a restart of the component. It
>>> seems like there is a warmup period for the JDBC connection. Is this to be
>>> expected?
>>>
>>> Thanks!
>>>
>>


Re: Phoenix JDBC Connection Warmup

2019-01-30 Thread Jaanai Zhang
It is expected when a table is first queried after establishing the
connection. Some meta information is loaded into the local cache, which
takes some time, mainly in two respects: 1. accessing the SYSTEM.CATALOG
table to get the table's schema information, and 2. accessing the HBase
meta table to get the table's region information.
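
If you want to reduce first-query latency, one option is to warm this cache
yourself at application start-up. A rough sketch; the table names and JDBC
URL are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class ConnectionWarmup {
    // Placeholder table list: touch each table once at start-up so that its
    // SYSTEM.CATALOG metadata and HBase region locations are cached before real traffic.
    private static final List<String> TABLES = Arrays.asList("ORDERS", "CUSTOMERS");

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            for (String table : TABLES) {
                // LIMIT 1 keeps the scan cheap while still forcing the client to
                // resolve the table's schema and region boundaries.
                stmt.executeQuery("SELECT * FROM " + table + " LIMIT 1").close();
            }
        }
    }
}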


   Jaanai Zhang
   Best regards!



William Shen  于2019年1月31日周四 下午1:37写道:

> Hi there,
>
> I have a component that makes Phoenix queries via the Phoenix JDBC
> Connection. I noticed that consistently, the Phoenix Client takes longer to
> execute a PreparedStatement and it takes longer to read through the
> ResultSet for a period of time (~15m) after a restart of the component. It
> seems like there is a warmup period for the JDBC connection. Is this to be
> expected?
>
> Thanks!
>


Re: column mapping schema decoding

2018-12-26 Thread Jaanai Zhang
The actual column name and the encoded qualifier are stored in the
SYSTEM.CATALOG table; the field names are COLUMN_NAME (string) and
COLUMN_QUALIFIER (binary) respectively. QualifierEncodingScheme can be used
to decode/encode COLUMN_QUALIFIER, but that is a somewhat complicated
process.

For your scenario, it is probably better to keep the original column names.
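
If you still need the mapping for the Lily indexer, a starting point could
be to read it straight out of SYSTEM.CATALOG. A rough sketch using the
us_population table from your example; the JDBC URL is a placeholder, and
the qualifier is only printed as hex rather than decoded:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ColumnQualifierDump {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT COLUMN_FAMILY, COLUMN_NAME, COLUMN_QUALIFIER "
                   + "FROM SYSTEM.CATALOG "
                   + "WHERE TABLE_NAME = ? AND COLUMN_QUALIFIER IS NOT NULL")) {
            ps.setString(1, "US_POPULATION");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    byte[] qualifier = rs.getBytes(3);
                    StringBuilder hex = new StringBuilder();
                    for (byte b : qualifier) {
                        hex.append(String.format("\\x%02X", b & 0xFF));
                    }
                    // Prints: column family, Phoenix column name, and the encoded
                    // qualifier bytes as they appear in the HBase shell.
                    System.out.println(rs.getString(1) + "  " + rs.getString(2) + "  " + hex);
                }
            }
        }
    }
}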





   Jaanai Zhang
   Best regards!



Shawn Li  于2018年12月27日周四 上午7:17写道:

> Hi Pedro,
>
> Thanks for reply. Can you explain a little bit more? For example, if we
> use COLUMN_ENCODED_BYTES = 1,How is the following table DDL converted to
> numbered column qualifier in Hbase? (such as A.population maps which
> number, B.zipcode Map to which number in Hbase)
>
> CREATE TABLE IF NOT EXISTS us_population (
>   state CHAR(2) NOT NULL,
>   city VARCHAR NOT NULL,
>   A.population BIGINT,
>   A.type CHAR,
>   B.zipcode CHAR(5),
>   B.quantity INT CONSTRAINT my_pk PRIMARY KEY (state, city));
>
>
> Thanks,
> Shawn
>
> On Wed, Dec 26, 2018 at 6:00 PM Pedro Boado  wrote:
>
>> Hi,
>>
>> Column mapping is stored in SYSTEM.CATALOG table . There is only one
>> column mapping strategy with between 1 to 4 bytes to be used to represent
>> column number. Regardless of encoded column size, column name lookup
>> strategy remains the same.
>>
>> Hope it helps,
>>
>> Pedro.
>>
>>
>>
>> On Wed, 26 Dec 2018, 23:00 Shawn Li >
>>> Hi,
>>>
>>> Phoenix 4.10 introduced column mapping feature. There are four types of
>>> mapping schema (https://phoenix.apache.org/columnencoding.html). Is
>>> there any documentation that shows how to encode/map string column name in
>>> Phoenix to number column qualifier in Hbase?
>>>
>>> We are using Lily Hbase indexer to do the batch indexing. So if the
>>> column qualifier is number. We need find a way to decode it back to the
>>> original String column name.
>>>
>>> Thanks,
>>> Shawn
>>>
>>


Re: Phoenix perform full scan and ignore covered global index

2018-12-23 Thread Jaanai Zhang
Could you please show the SQL of your CREATE TABLE/INDEX statements?


   Jaanai Zhang
   Best regards!




Batyrshin Alexander <0x62...@gmail.com> 于2018年12月23日周日 下午9:38写道:

> Examples:
>
> 1. Ignoring indexes if "*" used for select even index include all columns
> from source table
>
> 0: jdbc:phoenix:127.0.0.1> explain select * from table where "p" =
> '123123123';
>
> +---+-+++
> | PLAN
>   | EST_BYTES_READ  | EST_ROWS_READ  |
> EST_INFO_TS   |
>
> +---+-+++
> | CLIENT 1608-CHUNK 237983037 ROWS 160746749821 BYTES PARALLEL 30-WAY FULL
> SCAN OVER table  | 160746749821| 237983037  | 1545484493647  |
> | SERVER FILTER BY d."p" = '123123123'
>   | 160746749821| 237983037  | 1545484493647  |
> | CLIENT MERGE SORT
>  | 160746749821| 237983037  |
> 1545484493647  |
>
> +---+-+++
> 3 rows selected (0.05 seconds)
>
>
> 2. Indexes used if only 1 column selected
>
> 0: jdbc:phoenix:127.0.0.1> explain select "c" from table where "p" =
> '123123123';
>
> +-+-+++
> |
>   PLAN
>| EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
>
> +-+-+++
> | CLIENT 30-CHUNK 3569628 ROWS 3145729398 BYTES PARALLEL 30-WAY RANGE SCAN
> OVER table_idx_p [0,'123123123'] - [29,'123123123']  | 3145729398  |
> 3569628| 1545484508039  |
> | SERVER FILTER BY FIRST KEY ONLY
>
>| 3145729398  | 3569628| 1545484508039  |
> | CLIENT MERGE SORT
>
>| 3145729398  | 3569628| 1545484508039  |
>
> +-+-+++
> 3 rows selected (0.038 seconds)
>
>
> 3.
>
> 0: jdbc:phoenix:127.0.0.1> explain select /*+ INDEX(table table_idx_p) */
> * from table where "p" = '123123123';
>
> +-+-+++
> |
>   PLAN
>| EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
>
> +-+-+++
> | CLIENT 1608-CHUNK 237983037 ROWS 160746749821 BYTES PARALLEL 30-WAY FULL
> SCAN OVER table
> | 3145729398  | 3569628| 1545484508039  |
> | CLIENT MERGE SORT
>
>| 3145729398  | 3569628| 1545484508039  |
> | SKIP-SCAN-JOIN TABLE 0
>
> | 3145729398  | 3569628| 1545484508039
> |
> | CLIENT 30-CHUNK 3569628 ROWS 3145729398 BYTES PARALLEL 30-WAY
> RANGE SCAN OVER table_idx_p [0,'123123123'] - [29,'123123123']  |
> 3145729398  | 3569628| 1545484508039  |
> | SERVER FILTER BY FIRST KEY ONLY
>
>| 3145729398  | 3569628| 1545484508039  |
> | CLIENT MERGE SORT
>
>| 3145729398  | 3569628| 1545484508039  |
> | DYNAMIC SERVER FILTER BY "table.c" IN ($35.$37)
>
>| 3145729398  | 3569628| 1545484508039  |
>
> +-+-+++
> 7 rows selected (0.12 seconds)
>
>
>


Re: "upsert select" with "limit" clause

2018-12-19 Thread Jaanai Zhang
Shawn,

I have done some tests with the 4.14.1-HBase-1.4 version. The details are
as follows:

CREATE TABLE test (id VARCHAR PRIMARY KEY, c1 varchar, c2 varchar)
SALT_BUCKETS = 10;

explain select * from test where c1 = 'xx' limit 5 offset 100;

CREATE TABLE test1 (id VARCHAR PRIMARY KEY, c1 varchar, c2 varchar)
SALT_BUCKETS = 10;

explain upsert into test1 select * from test limit 10;



0: jdbc:phoenix:thin:url=http://localhost:876> explain upsert into test1 select * from test limit 10;
+--------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                         PLAN                                         | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| UPSERT SELECT                                                                         | 2040            | 10             | 0            |
| CLIENT 10-CHUNK 10 ROWS 2040 BYTES *SERIAL* 10-WAY ROUND ROBIN FULL SCAN OVER TEST    | 2040            | 10             | 0            |
| SERVER 10 ROW LIMIT                                                                   | 2040            | 10             | 0            |
| CLIENT 10 ROW LIMIT                                                                   | 2040            | 10             | 0            |
+--------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.028 seconds)

0: jdbc:phoenix:thin:url=http://localhost:876> explain upsert into test1 select * from test;
+-------------------------------------------------------------------+-----------------+----------------+--------------+
|                               PLAN                                | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-------------------------------------------------------------------+-----------------+----------------+--------------+
| UPSERT SELECT                                                      | null            | null           | null         |
| CLIENT 10-CHUNK PARALLEL 10-WAY ROUND ROBIN FULL SCAN OVER TEST    | null            | null           | null         |
+-------------------------------------------------------------------+-----------------+----------------+--------------+
2 rows selected (0.033 seconds)


I notice that the UPSERT produces a serial scan when the LIMIT clause is
present. What is your Phoenix version?   @Vincent FYI






----
   Jaanai Zhang
   Best regards!



Vincent Poon  于2018年12月20日周四 上午6:04写道:

> Shawn,
>
> Took a quick look, I think what is happening is the UPSERT is done
> serially when you have LIMIT.
> Parallel scans are issued for the SELECT, which is why the explain plan
> shows PARALLEL, but then the results are concatenated via a single
> LimitingResultIterator, in order to apply the CLIENT LIMIT.
> The upsert then reads from that iterator and does the mutations in batches.
>
> To insert in parallel, we would need some sort of shared state between the
> writing threads to ensure we respect the limit, and I don't think we
> currently have something like that.
>
> Vincent
>
> On Tue, Dec 18, 2018 at 2:31 PM Vincent Poon 
> wrote:
>
>>
>> Shawn, that sounds like a bug, I would file a JIRA.
>>
>> On Tue, Dec 18, 2018 at 12:33 PM Shawn Li  wrote:
>>
>>> Hi Vincent & William,
>>>
>>>
>>>
>>> Below is the explain plan, both are PARALLEL excuted in plan:
>>>
>>>
>>>
>>> explain upsert into table1 select * from table2;
>>>
>>>
>>>
>>> UPSERT
>>> SELECT
>>>   |
>>>
>>> CLIENT 27-CHUNK 915799 ROWS 2831155510 BYTES PARALLEL 18-WAY ROUND ROBIN
>>> FULL SCAN OVER table2
>>>
>>>
>>>
>>> explain upsert into table1 select * from table2 limit 200;
>>>
>>>
>>>
>>> UPSERT
>>> SELECT
>>>   |
>>>
>>> | CLIENT 27-CHUNK 3600 ROWS 48114000 BYTES PARALLEL 18-WAY ROUND
>>> ROBIN FULL SCAN OVER table2 |
>>>
>>> | SERVER 200 ROW
>>> LIMIT
>>> |
>>>
>>> | CLIENT 200 ROW LIMIT
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Shawn
>>>
>>> On Tue, Dec 18, 2018, 13:30 Vincent Poon >>
>>>> Shawn,
>>>>
>>>> Can you do an "explain" to show what your two statements are doing?
>>>> That might give some clues.  Perhaps one is able to be run on the server
>>>> for some reason and the other is not.
>>>> Otherwise, I don't see why one would be substantia

Re: "upsert select" with "limit" clause

2018-12-13 Thread Jaanai Zhang
Shawn,

The UPSERT SELECT will run in a coprocessor on the server side if it has no
LIMIT clause, it queries only one table whose schema matches the target, it
performs no aggregation, no sequences are used, and auto commit is on.
Please check your SQL ... and you can also check whether some resources
have not been released.
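
A quick way to compare the two cases is to look at the plans over JDBC with
auto commit on. A rough sketch; the JDBC URL is a placeholder and
table1/table2 are from your earlier mail:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class UpsertSelectPlans {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")) {
            // Auto commit on is one of the conditions for the server-side UPSERT SELECT path.
            conn.setAutoCommit(true);
            try (Statement stmt = conn.createStatement()) {
                printPlan(stmt, "EXPLAIN UPSERT INTO table2 SELECT * FROM table1");
                printPlan(stmt, "EXPLAIN UPSERT INTO table2 SELECT * FROM table1 LIMIT 300");
            }
        }
    }

    private static void printPlan(Statement stmt, String sql) throws Exception {
        System.out.println(sql);
        try (ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println("  " + rs.getString(1));
            }
        }
    }
}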


   Jaanai Zhang
   Best regards!



Shawn Li  于2018年12月13日周四 下午12:10写道:

> Hi Jaanai,
>
> Thanks for putting your thought. The behavior you describe is correct on
> the Hbase region sever side. The memory usage for blockcache and memstore
> will be high under such high throughput. But our phoenix client is on a
> gateway machine (no hbase region server sitting on it or any Hbase service
> on it), so not sure how to explain such high memory usage for upsert select
> without "limit" clause. The high memory usage behavior like all select
> results send to client machine, cached in client machine's memory, and then
> insert back to target table, which is not like the behavior that should
> happen, all of this should be done on the server side as the table schema
> is exactly the same. By the way, this happens on both Phoenix 4.7 and
> Phoenix 4.14.
>
>
> Thanks,
> Shawn
>
> On Wed, Dec 12, 2018 at 10:26 PM Jaanai Zhang 
> wrote:
>
>> Shawn,
>>
>>
>> For the upsert without limit,  which will read the source table and write
>> the target tables on the server side.  I think the higher memory usage is
>> caused by using scan cache and memstore under the higher throughput.
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> Shawn Li  于2018年12月13日周四 上午10:13写道:
>>
>>> Hi Vincent,
>>>
>>> So you describe limit will sent result to client side and then write to
>>> server, this might explain why upsert with limit is slower than without
>>> limit. But looks like it can't explain the memory usage? The memory usage
>>> on client machine is 8gb (without "limit") vs 2gb (with limit), sometime
>>> upsert without "limit" can even reach 20gb for big table.
>>>
>>> Thanks,
>>> Shawn
>>>
>>> On Wed, Dec 12, 2018 at 6:34 PM Vincent Poon 
>>> wrote:
>>>
>>>> I think it's done client-side if you have LIMIT.  If you have e.g.
>>>> LIMIT 1000 , it would be incorrect for each regionserver to upsert 100, if
>>>> you have more than one regionserver.  So instead results are sent back to
>>>> the client, where the LIMIT is applied and then written back to the server
>>>> in the UPSERT.
>>>>
>>>> On Wed, Dec 12, 2018 at 1:18 PM Shawn Li  wrote:
>>>>
>>>>> Hi Vincent,
>>>>>
>>>>>
>>>>>
>>>>> The table creation statement is similar to below. We have about 200
>>>>> fields. Table is mutable and don’t have any index on the table.
>>>>>
>>>>>
>>>>>
>>>>> CREATE TABLE IF NOT EXISTS us_population (
>>>>>
>>>>>   state CHAR(2) NOT NULL,
>>>>>
>>>>>   city VARCHAR,
>>>>>
>>>>>   population BIGINT,
>>>>>
>>>>>   …
>>>>>
>>>>>   CONSTRAINT my_pk PRIMARY KEY (state));
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Shawn
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Dec 12, 2018, 13:42 Vincent Poon >>>> wrote:
>>>>>
>>>>>> For #2, can you provide the table definition and the statement used?
>>>>>> e.g. Is the table immutable, or does it have indexes?
>>>>>>
>>>>>> On Tue, Dec 11, 2018 at 6:08 PM Shawn/Xiang Li 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 1.   Want to check what is underlying running for limit clause
>>>>>>> used in the following Upsert statement (is it involving any coprocessor
>>>>>>> working behind?):
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *  upsert into table2 select * from
>>>>>>> table1 limit 300; * (table 1 and table 2 have same schema)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   The above statement is running a lot slower than
>>>>>>> without “limit”  clause as shown in following, even the above statement
>>>>>>> upsert less data:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *upsert into table2 select * from
>>>>>>> table1;*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2.   We also observe memory usable is pretty high without the
>>>>>>> limit clause (8gb vs 2gb), sometimes for big table it can reach 20gb
>>>>>>> without using limit clause.  According to phoenix website description 
>>>>>>> for
>>>>>>> upsert select “If auto commit is on, and both a) the target table 
>>>>>>> matches
>>>>>>> the source table, and b) the select performs no aggregation, then the
>>>>>>> population of the target table will be done completely on the 
>>>>>>> server-side
>>>>>>> (with constraint violations logged, but otherwise ignored).”
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>My question is If everything is done on server-side,
>>>>>>> how come we have such high memory usage on the client machine?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Shawn
>>>>>>>
>>>>>>


Re: "upsert select" with "limit" clause

2018-12-12 Thread Jaanai Zhang
Shawn,


The upsert without LIMIT reads the source table and writes the target table
on the server side. I think the higher memory usage is caused by the scan
cache and memstore under the higher throughput.


   Jaanai Zhang
   Best regards!



Shawn Li  于2018年12月13日周四 上午10:13写道:

> Hi Vincent,
>
> So you describe limit will sent result to client side and then write to
> server, this might explain why upsert with limit is slower than without
> limit. But looks like it can't explain the memory usage? The memory usage
> on client machine is 8gb (without "limit") vs 2gb (with limit), sometime
> upsert without "limit" can even reach 20gb for big table.
>
> Thanks,
> Shawn
>
> On Wed, Dec 12, 2018 at 6:34 PM Vincent Poon 
> wrote:
>
>> I think it's done client-side if you have LIMIT.  If you have e.g. LIMIT
>> 1000 , it would be incorrect for each regionserver to upsert 100, if you
>> have more than one regionserver.  So instead results are sent back to the
>> client, where the LIMIT is applied and then written back to the server in
>> the UPSERT.
>>
>> On Wed, Dec 12, 2018 at 1:18 PM Shawn Li  wrote:
>>
>>> Hi Vincent,
>>>
>>>
>>>
>>> The table creation statement is similar to below. We have about 200
>>> fields. Table is mutable and don’t have any index on the table.
>>>
>>>
>>>
>>> CREATE TABLE IF NOT EXISTS us_population (
>>>
>>>   state CHAR(2) NOT NULL,
>>>
>>>   city VARCHAR,
>>>
>>>   population BIGINT,
>>>
>>>   …
>>>
>>>   CONSTRAINT my_pk PRIMARY KEY (state));
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Shawn
>>>
>>>
>>>
>>> On Wed, Dec 12, 2018, 13:42 Vincent Poon >>
>>>> For #2, can you provide the table definition and the statement used?
>>>> e.g. Is the table immutable, or does it have indexes?
>>>>
>>>> On Tue, Dec 11, 2018 at 6:08 PM Shawn/Xiang Li 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> 1.   Want to check what is underlying running for limit clause
>>>>> used in the following Upsert statement (is it involving any coprocessor
>>>>> working behind?):
>>>>>
>>>>>
>>>>>
>>>>> *  upsert into table2 select * from
>>>>> table1 limit 300; * (table 1 and table 2 have same schema)
>>>>>
>>>>>
>>>>>
>>>>>   The above statement is running a lot slower than without
>>>>> “limit”  clause as shown in following, even the above statement upsert 
>>>>> less
>>>>> data:
>>>>>
>>>>>
>>>>>
>>>>> *upsert into table2 select * from
>>>>> table1;*
>>>>>
>>>>>
>>>>>
>>>>> 2.   We also observe memory usable is pretty high without the
>>>>> limit clause (8gb vs 2gb), sometimes for big table it can reach 20gb
>>>>> without using limit clause.  According to phoenix website description for
>>>>> upsert select “If auto commit is on, and both a) the target table matches
>>>>> the source table, and b) the select performs no aggregation, then the
>>>>> population of the target table will be done completely on the server-side
>>>>> (with constraint violations logged, but otherwise ignored).”
>>>>>
>>>>>
>>>>>
>>>>>My question is If everything is done on server-side,
>>>>> how come we have such high memory usage on the client machine?
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Shawn
>>>>>
>>>>


Re: Hbase vs Phoenix column names

2018-12-10 Thread Jaanai Zhang
The difference is due to encoded column names, which are supported since
version 4.10 (also see PHOENIX-1598
<https://issues.apache.org/jira/browse/PHOENIX-1598>).
You can set the COLUMN_ENCODED_BYTES property in the CREATE TABLE SQL to
keep the original column names, for example:

create table test(
  id varchar primary key,
  col varchar
) COLUMN_ENCODED_BYTES = 0;



----
   Jaanai Zhang
   Best regards!



Anil  于2018年12月11日周二 下午1:24写道:

> HI,
>
> We have upgraded phoenix to Phoenix-4.11.0-cdh5.11.2 from phoenix 4.7.
>
> Problem - When a table is created in phoenix, underlying hbase column
> names and phoenix column names are different. Tables created in 4.7 version
> looks good. Looks
>
> CREATE TABLE TST_TEMP (TID VARCHAR PRIMARY KEY ,PRI VARCHAR,SFLG
> VARCHAR,PFLG VARCHAR,SOLTO VARCHAR,BILTO VARCHAR) COMPRESSION = 'SNAPPY';
>
> 0: jdbc:phoenix:dq-13.labs.> select TID,PRI,SFLG from TST_TEMP limit 2;
> +-++---+
> |   TID   |PRI |SFLG   |
> +-++---+
> | 0060189122  | 0.00   |   |
> | 0060298478  | 13390.26   |   |
> +-++---+
>
>
> hbase(main):011:0> scan 'TST_TEMP', {LIMIT => 2}
> ROW  COLUMN+CELL
>  0060189122  column=0:\x00\x00\x00\x00,
> timestamp=1544296959236, value=x
>  0060189122  column=0:\x80\x0B,
> timestamp=1544296959236, value=0.00
>  0060298478  column=0:\x00\x00\x00\x00,
> timestamp=1544296959236, value=x
>  0060298478  column=0:\x80\x0B,
> timestamp=1544296959236, value=13390.26
>
>
> hbase columns names are completely different than phoenix column names.
> This change observed only post up-gradation. all existing tables created in
> earlier versions looks good and alter statements to existing tables also
> looks good.
>
> Is there any workaround to avoid this difference? we could not run hbase
> mapreduce jobs on hbase tables created  by phoenix. Thanks.
>
> Thanks
>
>
>
>
>
>
>


Re: HBase Compaction Fails for Phoenix Table

2018-11-21 Thread Jaanai Zhang
Can you please attach the CREATE TABLE/INDEX SQL? Which Phoenix version is
this? Are you sure many rows of data have been corrupted, or only this one
row?


   Jaanai Zhang
   Best regards!



William Shen  于2018年11月21日周三 上午10:20写道:

> Hi there,
>
> We've encountered the following compaction failure(#1) for a Phoenix
> table, and are not sure how to make sense of it. Using the HBase row key
> from the error, we are able to query data directly from hbase shell, and by
> examining the data, there arent anything immediately obvious about the data
> as they seem to be stored consistent with the Phoenix data type (#2). When
> querying Phoenix for the given row through sqline, the row would return if
> only primary key columns are selected, and the query would not return if
> non primary key columns are selected (#3).
>
> Few questions hoping to find some help on:
>
> a. Are we correct in understanding the error message to indicate an issue
> with data for the row key (
> \x05\x80\x00\x00\x00\x00\x1FT\x9C\x80\x00\x00\x00\x00\x1C}E\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00Ij\x9D\x80\x00\x00\x00\x01\xD1W\x13)?
> We are not sure what to make sense of the string "
> 1539019716378.3dcf2b1e057915feb74395d9711ba4ad." that is included with
> the row key...
>
> b. What is out of bound here? It's not apparently clear here what
> StatisticsScanner and FastDiffDeltaEncoder are tripping over...
>
> c. Is it normal for hbase shell to return some parts of the hex string as
> ascii characters? We are seeing that in the row key as well as column name
> encoding and value. We are not sure if that is causing any issues, or if
> that was just a display issue that we can safely ignore.
>
> *#1 Compaction Failure*
>
> Compaction failed Request = 
> regionName=qa2.ADGROUPS,\x05\x80\x00\x00\x00\x00\x1FT\x9C\x80\x00\x00\x00\x00\x1C}E\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00Ij\x9D\x80\x00\x00\x00\x01\xD1W\x13,1539019716378.3dcf2b1e057915feb74395d9711ba4ad.,
>  storeName=AG, fileCount=4, fileSize=316.0 M (315.8 M, 188.7 K, 6.8 K, 14.2 
> K), priority=1, time=40613533856170784
> java.lang.IndexOutOfBoundsException
>   at java.nio.Buffer.checkBounds(Buffer.java:567)
>   at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:149)
>   at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:465)
>   at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:516)
>   at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:618)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.next(HFileReaderV2.java:1277)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:180)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:588)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:458)
>   at 
> org.apache.phoenix.schema.stats.StatisticsScanner.next(StatisticsScanner.java:69)
>   at 
> org.apache.phoenix.schema.stats.StatisticsScanner.next(StatisticsScanner.java:76)
>   at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:334)
>   at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:106)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:131)
>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1245)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1852)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:529)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:566)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
>
>
> *#2 Data on the HBase level*
>
> hbase(main):002:0> get 'qa2.ADGROUPS', 
> "\x05\x80\x00\x00\x00\x00\x1FT\x9C\x80\x00\x00\x00\x00\x1C}E\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00Ij\x9D\x80\x00\x00\x00\x01\xD1W\x13"
> COLUMN  CELL
>  AG:\x00\x00\x00\x00timestamp=1539019457506, 

Re: Query logging - PHOENIX-2715

2018-11-20 Thread Jaanai Zhang
Yep! Those configuration options do not exist.


   Jaanai Zhang
   Best regards!



Curtis Howard  于2018年11月20日周二 下午11:42写道:

> Hi Jaanai,
>
> Thanks for your suggestion.  Just confirming then - it sounds like this
> would involve adding custom code, as there are no current configuration
> options to enable logging/capture of all query statements (including
> DDL/DML) from clients?
>
> Thanks for your help
> Curtis
>
> On Tue, Nov 20, 2018 at 8:33 AM Jaanai Zhang 
> wrote:
>
>> We can't capture some detail information about the DDL/DML operations by
>> TRACE log. I suggest that you can print logs of these operations on the
>> logic layer.
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> Curtis Howard  于2018年11月20日周二 上午11:20写道:
>>
>>> Hi,
>>>
>>> Is the expected behavior for this new feature to capture all operations
>>> (UPSERT / DROP / CREATE / ...)?  After enabling phoenix.log.level=TRACE, I
>>> see only SELECT queries populated in the SYSTEM.LOG table.
>>>
>>> Thanks!
>>> Curtis
>>>
>>>


Re: Query logging - PHOENIX-2715

2018-11-20 Thread Jaanai Zhang
We can't capture detailed information about DDL/DML operations via the
TRACE log. I suggest that you log these operations in your application's
logic layer.


   Jaanai Zhang
   Best regards!



Curtis Howard  于2018年11月20日周二 上午11:20写道:

> Hi,
>
> Is the expected behavior for this new feature to capture all operations
> (UPSERT / DROP / CREATE / ...)?  After enabling phoenix.log.level=TRACE, I
> see only SELECT queries populated in the SYSTEM.LOG table.
>
> Thanks!
> Curtis
>
>


Re: Issue in upgrading phoenix : java.lang.ArrayIndexOutOfBoundsException: SYSTEM:CATALOG 63

2018-10-17 Thread Jaanai Zhang
It seems that it is impossible to upgrade in place from Phoenix-4.6 to
Phoenix-4.14; the schema of the SYSTEM tables has changed, and some features
will be incompatible. Maybe you can migrate the data from Phoenix-4.6 to
Phoenix-4.14 instead; that approach can ensure that everything ends up
correct.


   Jaanai Zhang
   Best regards!



Tanvi Bhandari  于2018年10月17日周三 下午3:48写道:

> @Shamvenk
>
> Yes I did check the STATS table from hbase shell, it's not empty.
>
> After dropping all SYSTEM tables and mapping hbase-tables to phoenix
> tables by executing all DDLs, I am seeing new issue.
>
> I have a table and an index on that table. Number of records in index
> table and main table are not matching now.
> select count(*) from "my_index";
> select count(COL) from "my_table";-- where COL is not part of index.
>
> Can someone tell me what can be done here? Is there any easier way to
> upgrade from Phoenix-4.6 to Phoenix-4.14?
>
>
>
> On Thu, Sep 13, 2018 at 8:55 PM venk sham  wrote:
>
>> Did you check system.stats,. If it us empty, needs to be rebuilt by
>> running major compact on hbasr
>>
>> On Tue, Sep 11, 2018, 11:33 AM Tanvi Bhandari 
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> I am trying to upgrade the phoenix binaries in my setup from phoenix-4.6
>>> (had optional concept of schema) to phoenix-4.14 (schema is a must in
>>> here).
>>>
>>> Earlier, I had the phoenix-4.6-hbase-1.1 binaries. When I try to run the
>>> phoenix-4.14-hbase-1.3 on the same data. Hbase comes up fine But when I try
>>> to connect to phoenix using sqline client,  I get the following error on
>>> *console*:
>>>
>>>
>>>
>>> 18/09/07 04:22:48 WARN ipc.CoprocessorRpcChannel: Call failed on
>>> IOException
>>>
>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>> org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM:CATALOG: 63
>>>
>>> at
>>> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:120)
>>>
>>> at
>>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3572)
>>>
>>> at
>>> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16422)
>>>
>>> at
>>> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
>>>
>>> at
>>> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
>>>
>>> at
>>> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
>>>
>>> at
>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
>>>
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
>>>
>>> at
>>> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>>>
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>>>
>>> at
>>> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 63
>>>
>>> at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:517)
>>>
>>> at
>>> org.apache.phoenix.schema.PTableImpl.(PTableImpl.java:421)
>>>
>>> at
>>> org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
>>>
>>> at
>>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1046)
>>>
>>> at
>>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:587)
>>>
>>>at
>>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.loadTable(MetaDataEndpointImpl.java:1305)
>>>
>>> at
>>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3568)
>>>
>>> ... 10 more
>>>
>>>
>>>
>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>> Method)
>>>
>>> at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.j

Re: Encountering BufferUnderflowException when querying from Phoenix

2018-10-14 Thread Jaanai Zhang
It looks like a bug where the length requested is greater than the bytes
remaining in the ByteBuffer. Maybe there is a problem with the ByteBuffer's
position or with the length of the target byte array.


   Jaanai Zhang
   Best regards!



William Shen  于2018年10月12日周五 下午11:53写道:

> Hi all,
>
> We are running Phoenix 4.13, and periodically we would encounter the
> following exception when querying from Phoenix in our staging environment.
> Initially, we thought we had some incompatible client version connecting
> and creating data corruption, but after ensuring that we are only
> connecting with 4.13 clients, we still see this issue come up from time to
> time. So far, fortunately, since it is in staging, we are able to identify
> and delete the data to restore service.
>
> However, would like to ask for guidance on what else we could look for to
> identify the cause of this exception. Could this perhaps caused by
> something other than data corruption?
>
> Thanks in advance!
>
> The exception looks like:
>
> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage
> 14.0 (TID 1275, ...datanode..., executor 82):
> java.nio.BufferUnderflowException
>
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
>
> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>
> at
> org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
>
> at
> org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
>
> at
> org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
>
> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
>
> at
> org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>
> at
> org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
>
> at
> org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
>
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>
> at
> org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
>
> at
> org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
>
> at
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
>
> at
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>
> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
>
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>
> at
> org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> at java.lang.Thread.run(Thread.java:748)
>
>
>


Re: Concurrent phoenix queries throw unable to create new native thread error

2018-10-10 Thread Jaanai Zhang
>
> Often times, concurrent queries fail with "java.lang.OutOfMemoryError:
> unable to create new native thread
>

You can try to adjust the JVM options of your client program.

connectionProps.setProperty("phoenix.query.threadPoolSize", "2000")
> connectionProps.setProperty("phoenix.query.querySize", "4")


Did you try to decrease the values of the above configurations?
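
For reference, a minimal sketch of passing these as connection properties.
The JDBC URL is a placeholder, the values shown are just the documented
defaults rather than a recommendation, and note that the documented name of
the queue property is phoenix.query.queueSize:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class PhoenixClientPoolConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Client-side scan thread pool; fewer threads means fewer native threads
        // per JVM at the cost of less scan parallelism. 128 is the documented default.
        props.setProperty("phoenix.query.threadPoolSize", "128");
        // Depth of the queue feeding that pool; 5000 is the documented default.
        props.setProperty("phoenix.query.queueSize", "5000");
        // These take effect when the driver's services are first initialized in the
        // JVM, so set them on the very first connection (or in hbase-site.xml).
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props)) {
            // ... run the concurrent queries here ...
        }
    }
}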

   Jaanai Zhang
   Best regards!



Hemal Parekh  于2018年10月11日周四 上午1:18写道:

> limits.conf has following which I thought were sufficient. I will check if
> these limits are getting exceeded.
>
> *   -   nofile   32768
> *   -   nproc   65536
>
>
>
> Thanks,
> Hemal
>
>
> On Wed, Oct 10, 2018 at 12:40 PM Pedro Boado 
> wrote:
>
>> Are you reaching any of the ulimits for the user running your application?
>>
>> On Wed, 10 Oct 2018, 17:00 Hemal Parekh,  wrote:
>>
>>> We have an analytical application running concurrent phoenix queries
>>> against Hortonworks HDP 2.6 cluster. Application uses phoenix JDBC
>>> connection to run queries. Often times, concurrent queries fail with
>>> "java.lang.OutOfMemoryError: unable to create new native thread" error.
>>> JDBC connection sets following phoenix properties.
>>>
>>> connectionProps.setProperty("phoenix.query.threadPoolSize", "2000")
>>> connectionProps.setProperty("phoenix.query.querySize", "4")
>>>
>>> Phoenix version is 4.7 and Hbase version is 1.1.2, The HDP cluster has
>>> six regionservers on six data nodes. Concurrent queries run against
>>> different phoenix tables, some are small having few million records and
>>> some are big having few billions records. Most of the queries do not have
>>> joins,  where clause includes conditions on rowkey and few nonkey columns.
>>> Queries with joins (which are on small tables) have used
>>> USE_SORT_MERGE_JOIN hint.
>>>
>>> Are there other phoenix properties which need to be set on JDBC
>>> connection? Are above values for phoenix.query.threadPoolSize and 
>>> phoenix.query.querySize
>>> enough to handle concurrent query use case? We have changed these two
>>> properties couple of times to increase their values but the error still
>>> remains the same.
>>>
>>>
>>> Thanks,
>>>
>>> Hemal Parekh
>>>
>>>
>>>
>>>
>
> --
>
> Hemal Parekh
> Senior Data Warehouse Architect
> m. 240.449.4396
> [image: Bitscopic Inc] <http://bitscopic.com>
>
>


Re: Table dead lock: ERROR 1120 (XCL20): Writes to table blocked until index can be updated

2018-09-29 Thread Jaanai Zhang
Did you restart the cluster? Also, you should set 'hbase.hregion.max.filesize'
to a safeguard value that is less than the region server's capacity.


   Jaanai Zhang
   Best regards!



Batyrshin Alexander <0x62...@gmail.com> 于2018年9月29日周六 下午5:28写道:

> Meanwhile we tried to disable regions split via per index table options
> 'SPLIT_POLICY' =>
> 'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy'  and 
> hbase.hregion.max.filesize
> = 10737418240
> Looks like this options set doesn't. Some regions splits at size < 2GB
>
> Then we tried to disable all splits via hbase shell: splitormerge_switch
> 'SPLIT', false
> Seems that this also doesn't work.
>
> Any ideas why we can't disable regions split?
>
> On 27 Sep 2018, at 02:52, Vincent Poon  wrote:
>
> We are planning a Phoenix 4.14.1 release which will have this fix
>
> On Wed, Sep 26, 2018 at 3:36 PM Batyrshin Alexander <0x62...@gmail.com>
> wrote:
>
>> Thank you. We will try somehow...
>> Is there any chance that this fix will be included in next release for
>> HBASE-1.4 (not 2.0)?
>>
>> On 27 Sep 2018, at 01:04, Ankit Singhal  wrote:
>>
>> You might be hitting PHOENIX-4785
>> <https://jira.apache.org/jira/browse/PHOENIX-4785>,  you can apply the
>> patch on top of 4.14 and see if it fixes your problem.
>>
>> Regards,
>> Ankit Singhal
>>
>> On Wed, Sep 26, 2018 at 2:33 PM Batyrshin Alexander <0x62...@gmail.com>
>> wrote:
>>
>>> Any advices? Helps?
>>> I can reproduce problem and capture more logs if needed.
>>>
>>> On 21 Sep 2018, at 02:13, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>>
>>> Looks like lock goes away 30 minutes after index region split.
>>> So i can assume that this issue comes from cache that configured by this
>>> option:* phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs*
>>>
>>>
>>>
>>> On 21 Sep 2018, at 00:15, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>>
>>> And how this split looks at Master logs:
>>>
>>> Sep 20 19:45:04 prod001 hbase[10838]: 2018-09-20 19:45:04,888 INFO
>>>  [AM.ZK.Worker-pool5-t282] master.RegionStates: Transition
>>> {3e44b85ddf407da831dbb9a871496986 state=OPEN,
>>> ts=1537304859509, server=prod013,60020,1537304282885} to
>>> {3e44b85ddf407da831dbb9a871496986 state=SPLITTING, ts=1537461904888,
>>> server=prod
>>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340 INFO
>>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>>> {3e44b85ddf407da831dbb9a871496986 state=SPLITTING, ts=1537461905340,
>>> server=prod013,60020,1537304282885} to {3e44b85ddf407da831dbb9a871496986
>>> state=SPLIT, ts=1537461905340, server=pro
>>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340 INFO
>>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Offlined
>>> 3e44b85ddf407da831dbb9a871496986 from prod013,60020,1537304282885
>>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,341 INFO
>>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>>> {33cba925c7acb347ac3f5e70e839c3cb state=SPLITTING_NEW, ts=1537461905340,
>>> server=prod013,60020,1537304282885} to {33cba925c7acb347ac3f5e70e839c3cb
>>> state=OPEN, ts=1537461905341, server=
>>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,341 INFO
>>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>>> {acb8f16a004a894c8706f6e12cd26144 state=SPLITTING_NEW, ts=1537461905340,
>>> server=prod013,60020,1537304282885} to {acb8f16a004a894c8706f6e12cd26144
>>> state=OPEN, ts=1537461905341, server=
>>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,343 INFO
>>>  [AM.ZK.Worker-pool5-t284] master.AssignmentManager: Handled SPLIT
>>> event; 
>>> parent=IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1536637905252.3e44b85ddf407da831dbb9a871496986.,
>>> daughter a=IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1
>>> Sep 20 19:47:41 prod001 hbase[10838]: 2018-09-20 19:47:41,972 INFO
>>>  [prod001,6,1537304851459_ChoreService_2]
>>> balancer.StochasticLoadBalancer: Skipping load balancing because balanced
>>> cluster; total cost is 17.82282205608522, sum multiplier is 1102.0 min cost
>>> which need balance is 0.05
>>> Sep 20 19:47:42 prod001 hbase[10838]: 2018-09-20 19:47:42,021 INFO
>>>  [prod001,6,1537304851459_ChoreService_1] hbase.MetaTableAccessor:
>>> Deleted 
>>> IDX_MARK_O,\x107834005168\x0

Re: Phoenix 5.0 could not commit transaction: org.apache.phoenix.execute.CommitException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.phoenix.hbase

2018-09-25 Thread Jaanai Zhang
>
>
> Is my method of installing HBase and Phoenix correct?
>
Did you check which versions of HBase exist in your classpath?

Is this a compatibility issue with Guava?

It isn't an exception caused by an incompatibility with Guava.

----
   Jaanai Zhang
   Best regards!



Francis Chuang  于2018年9月25日周二 下午8:25写道:

> Thanks for taking a look, Jaanai!
>
> Is my method of installing HBase and Phoenix correct? See
> https://github.com/Boostport/hbase-phoenix-all-in-one/blob/master/Dockerfile#L12
>
> Is this a compatibility issue with Guava?
>
> Francis
>
> On 25/09/2018 10:21 PM, Jaanai Zhang wrote:
>
> org.apache.phoenix.hbase.index.covered.data.IndexMemStore$1 overrides
>> final method
>> compare.(Lorg/apache/hadoop/hbase/Cell;Lorg/apache/hadoop/hbase/Cell;)I
>>  at java.lang.ClassLoader.defineClass1(Native Method)
>>  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>  at
>>
>
> It looks like that HBase's Jars are incompatible.
>
> 
>Jaanai Zhang
>Best regards!
>
>
>
> Francis Chuang  于2018年9月25日周二 下午8:06写道:
>
>> Hi All,
>>
>> I recently updated one of my Go apps to use Phoenix 5.0 with HBase
>> 2.0.2. I am using my Phoenix + HBase all in one docker image available
>> here: https://github.com/Boostport/hbase-phoenix-all-in-one
>>
>> This is the log/output from the exception:
>>
>> RuntimeException: org.apache.phoenix.execute.CommitException:
>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
>> Failed 1 action:
>> org.apache.phoenix.hbase.index.builder.IndexBuildingFailureException:
>> Failed to build index for unexpected reason!
>>  at
>>
>> org.apache.phoenix.hbase.index.util.IndexManagementUtil.rethrowIndexingException(IndexManagementUtil.java:206)
>>  at
>> org.apache.phoenix.hbase.index.Indexer.preBatchMutate(Indexer.java:351)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1010)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1007)
>>  at
>>
>> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>>  at
>>
>> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preBatchMutate(RegionCoprocessorHost.java:1007)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.prepareMiniBatchOperations(HRegion.java:3487)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3896)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3854)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3785)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1027)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:959)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:922)
>>  at
>>
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2666)
>>  at
>>
>> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42014)
>>  at
>> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>>  at
>> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>>  at
>> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>>  at
>> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>>  Caused by: java.lang.VerifyError: class
>> org.apache.phoenix.hbase.index.covered.data.IndexMemStore$1 overrides
>> final method
>> compare.(Lorg/apache/hadoop/hbase/Cell;Lorg/apache/hadoop/hbase/Cell;)I
>>  at java.lang.ClassLoader.defineClass1(Native Method)
>>  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>>  at
>> java.security.SecureClassLoader.def

Re: Phoenix 5.0 could not commit transaction: org.apache.phoenix.execute.CommitException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.phoenix.hbase

2018-09-25 Thread Jaanai Zhang
>
> org.apache.phoenix.hbase.index.covered.data.IndexMemStore$1 overrides
> final method
> compare.(Lorg/apache/hadoop/hbase/Cell;Lorg/apache/hadoop/hbase/Cell;)I
>  at java.lang.ClassLoader.defineClass1(Native Method)
>  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>  at
>

It looks like HBase's jars are incompatible: a java.lang.VerifyError about
overriding a final method means the Phoenix server jar was compiled against a
different HBase release than the one it is running on, so the Phoenix build
needs to match the cluster's HBase version.

----
   Jaanai Zhang
   Best regards!



Francis Chuang  于2018年9月25日周二 下午8:06写道:

> Hi All,
>
> I recently updated one of my Go apps to use Phoenix 5.0 with HBase
> 2.0.2. I am using my Phoenix + HBase all in one docker image available
> here: https://github.com/Boostport/hbase-phoenix-all-in-one
>
> This is the log/output from the exception:
>
> RuntimeException: org.apache.phoenix.execute.CommitException:
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> Failed 1 action:
> org.apache.phoenix.hbase.index.builder.IndexBuildingFailureException:
> Failed to build index for unexpected reason!
>  at
>
> org.apache.phoenix.hbase.index.util.IndexManagementUtil.rethrowIndexingException(IndexManagementUtil.java:206)
>  at
> org.apache.phoenix.hbase.index.Indexer.preBatchMutate(Indexer.java:351)
>  at
>
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1010)
>  at
>
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$28.call(RegionCoprocessorHost.java:1007)
>  at
>
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>  at
>
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>  at
>
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preBatchMutate(RegionCoprocessorHost.java:1007)
>  at
>
> org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.prepareMiniBatchOperations(HRegion.java:3487)
>  at
>
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3896)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3854)
>  at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3785)
>  at
>
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1027)
>  at
>
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:959)
>  at
>
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:922)
>  at
>
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2666)
>  at
>
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42014)
>  at
> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>  at
> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>  at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>  at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>  Caused by: java.lang.VerifyError: class
> org.apache.phoenix.hbase.index.covered.data.IndexMemStore$1 overrides
> final method
> compare.(Lorg/apache/hadoop/hbase/Cell;Lorg/apache/hadoop/hbase/Cell;)I
>  at java.lang.ClassLoader.defineClass1(Native Method)
>  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>  at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>  at
> java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>  at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>  at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>  at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at
>
> org.apache.phoenix.hbase.index.covered.data.IndexMemStore.(IndexMemStore.java:82)
>  at
>
> org.apache.phoenix.hbase.index.covered.LocalTableState.(LocalTableState.java:57)
>  at
>
> org.apache.phoenix.h

Re: MutationState size is bigger than maximum allowed number of bytes

2018-09-19 Thread Jaanai Zhang
Are you configuring these on the server side? Your "UPSERT SELECT" statement
will be executed on the server side.
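
For what it's worth, the quoted stack trace goes through
ClientUpsertSelectMutationPlan, so the phoenix.mutate.* keys also matter on the
client connection, not only in the region servers' hbase-site.xml. A minimal
client-side sketch, assuming a hypothetical ZooKeeper quorum and limit values
(table names follow the quoted mail, with SOURCE_TABLE standing in for the
source table):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Properties;

public class UpsertSelectWithLimits {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Hypothetical limits; size them to what the batch actually needs.
    props.setProperty("phoenix.mutate.maxSize", "500000");
    props.setProperty("phoenix.mutate.maxSizeBytes", "1073741824");

    try (Connection conn =
             DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3:2181", props);
         Statement stmt = conn.createStatement()) {
      conn.setAutoCommit(true); // flush in batches instead of one huge MutationState
      stmt.executeUpdate(
          "UPSERT INTO TABLE_V2 (\"c\", \"id\", \"gt\") "
              + "SELECT \"c\", \"id\", \"gt\" FROM SOURCE_TABLE");
    }
  }
}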


   Jaanai Zhang
   Best regards!



Batyrshin Alexander <0x62...@gmail.com> 于2018年9月20日周四 上午7:48写道:

> I've tried to copy one table to other via UPSERT SELECT construction and
> got this errors:
>
> Phoenix-4.14-hbase-1.4
>
> 0: jdbc:phoenix:> !autocommit on
> Autocommit status: true
> 0: jdbc:phoenix:>
> 0: jdbc:phoenix:> UPSERT INTO TABLE_V2 ("c", "id", "gt")
> . . . . . . . . > SELECT "c", "id", "gt" FROM TABLE;
> Error: ERROR 730 (LIM02): MutationState size is bigger than maximum allowed 
> number of bytes (state=LIM02,code=730)
> java.sql.SQLException: ERROR 730 (LIM02): MutationState size is bigger than 
> maximum allowed number of bytes
> at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
> at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
> at 
> org.apache.phoenix.execute.MutationState.throwIfTooBig(MutationState.java:377)
> at org.apache.phoenix.execute.MutationState.join(MutationState.java:478)
> at 
> org.apache.phoenix.compile.MutatingParallelIteratorFactory$1.close(MutatingParallelIteratorFactory.java:98)
> at 
> org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:104)
> at 
> org.apache.phoenix.iterate.ConcatResultIterator.peek(ConcatResultIterator.java:112)
> at 
> org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:100)
> at 
> org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
> at 
> org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
> at org.apache.phoenix.trace.TracingIterator.next(TracingIterator.java:56)
> at 
> org.apache.phoenix.compile.UpsertCompiler$ClientUpsertSelectMutationPlan.execute(UpsertCompiler.java:1301)
> at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
> at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:389)
> at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
> at 
> org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
> at sqlline.Commands.execute(Commands.java:822)
> at sqlline.Commands.sql(Commands.java:732)
> at sqlline.SqlLine.dispatch(SqlLine.java:813)
> at sqlline.SqlLine.begin(SqlLine.java:686)
> at sqlline.SqlLine.start(SqlLine.java:398)
> at sqlline.SqlLine.main(SqlLine.java:291)
>
>
> Config:
>
> 
> phoenix.mutate.batchSize
> 200
> 
> 
> phoenix.mutate.maxSize
> 25
> 
> 
> phoenix.mutate.maxSizeBytes
> 10485760
> 
>
>
> Also mentioned this at https://issues.apache.org/jira/browse/PHOENIX-4671
>


Re: Encountering IllegalStateException while querying Phoenix

2018-09-19 Thread Jaanai Zhang
Are you sure you restarted the RS process? You can check whether
"phoenix-server.jar" exists in HBase's classpath with the "jinfo" command.
----
   Jaanai Zhang
   Best regards!



William Shen  于2018年9月20日周四 上午6:01写道:

> For anyone else interested: we ended up identifying one of the RS actually
> failed to load the UngroupedAggregateRegionObserver because of a strange
> XML parsing issue that was not occurring prior to this incident and not
> happening on the any other RS.
>
> Failed to load coprocessor 
> org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver
> java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: 
> jar:file:/opt/cloudera/parcels/CDH-5.9.2-1.cdh5.9.2.p0.3/jars/hadoop-common-2.6.0-cdh5.9.2.jar!/core-default.xml;
>  lineNumber: 196; columnNumber: 47; The string "--" is not permitted within 
> comments.
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2656)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2503)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2409)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:1144)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:1116)
>   at 
> org.apache.phoenix.util.PropertiesUtil.cloneConfig(PropertiesUtil.java:81)
>   at 
> org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.start(UngroupedAggregateRegionObserver.java:219)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$Environment.startup(CoprocessorHost.java:414)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:255)
>   at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:208)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:364)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.(RegionCoprocessorHost.java:226)
>   at org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:723)
>   at org.apache.hadoop.hbase.regionserver.HRegion.(HRegion.java:631)
>   at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown 
> Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:6145)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6449)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6421)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6377)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6328)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:362)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.xml.sax.SAXParseException; systemId: 
> jar:file:/opt/cloudera/parcels/CDH-5.9.2-1.cdh5.9.2.p0.3/jars/hadoop-common-2.6.0-cdh5.9.2.jar!/core-default.xml;
>  lineNumber: 196; columnNumber: 47; The string "--" is not permitted within 
> comments.
>   at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>   at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>   at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2491)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2479)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2550)
>   ... 27 more
>
>
>
> On Wed, Sep 19, 2018 at 2:15 PM William Shen 
> wrote:
>
>> Hi there,
>>
>> I have encountered the following exception while trying to query from
>> Phoenix (was able to generate the exception doing a simple SELECT
>> count(1)). I have verified (MD5) that each region server has the correct
>> phoenix jars. Would appreciate any guidance on how to proceed further in
>> troubleshooting this (or what could've caused thi

Re: Salting based on partial rowkeys

2018-09-13 Thread Jaanai Zhang
Sorry, I don't understand your purpose. According to your proposal, it seems
this can't be achieved. You want a hash partition, however a few things need to
be clarified: HBase is a range-partitioning engine, and salt buckets are used
to avoid hotspots. In other words, HBase as a storage engine can't support hash
partitioning.
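
For reference, salting today is declared per table rather than per column. A
minimal sketch of the current DDL over JDBC, with hypothetical table, column,
and quorum names:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SaltedTableExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
         Statement stmt = conn.createStatement()) {
      // The salt byte is a hash of the whole row key, prepended to it, which
      // spreads writes across 16 buckets; salting on a subset of the key
      // columns is exactly what PHOENIX-4757 asks for and is not possible today.
      stmt.execute("CREATE TABLE METRICS ("
          + " HOST VARCHAR NOT NULL,"
          + " TS DATE NOT NULL,"
          + " VAL DOUBLE"
          + " CONSTRAINT PK PRIMARY KEY (HOST, TS)"
          + ") SALT_BUCKETS = 16");
    }
  }
}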


   Jaanai Zhang
   Best regards!



Gerald Sangudi  于2018年9月13日周四 下午11:32写道:

> Hi folks,
>
> Any thoughts or feedback on this?
>
> Thanks,
> Gerald
>
> On Mon, Sep 10, 2018 at 1:56 PM, Gerald Sangudi 
> wrote:
>
>> Hello folks,
>>
>> We have a requirement for salting based on partial, rather than full,
>> rowkeys. My colleague Mike Polcari has identified the requirement and
>> proposed an approach.
>>
>> I found an already-open JIRA ticket for the same issue:
>> https://issues.apache.org/jira/browse/PHOENIX-4757. I can provide more
>> details from the proposal.
>>
>> The JIRA proposes a syntax of SALT_BUCKETS(col, ...) = N, whereas Mike
>> proposes SALT_COLUMN=col or SALT_COLUMNS=col, ... .
>>
>> The benefit at issue is that users gain more control over partitioning,
>> and this can be used to push some additional aggregations and hash joins
>> down to region servers.
>>
>> I would appreciate any go-ahead / thoughts / guidance / objections /
>> feedback. I'd like to be sure that the concept at least is not
>> objectionable. We would like to work on this and submit a patch down the
>> road. I'll also add a note to the JIRA ticket.
>>
>> Thanks,
>> Gerald
>>
>>
>


Re: Missing content in phoenix after writing from Spark

2018-09-12 Thread Jaanai Zhang
It seems the column data is missing the schema mapping information. If you want
to keep writing the HBase table this way, you can create the HBase table first
and then map it with a Phoenix view or table.
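
A minimal sketch of that mapping, with hypothetical HBase table and column
family names. The quoted identifiers must match the HBase names exactly, and a
view over an existing HBase table is read-only from the Phoenix side (use
CREATE TABLE instead if Phoenix should also write to it):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class MapExistingHBaseTable {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
         Statement stmt = conn.createStatement()) {
      // Map the existing HBase table "t1" (row key plus column family "cf")
      // so Phoenix knows how each cell should be decoded.
      stmt.execute("CREATE VIEW \"t1\" ("
          + " pk VARCHAR PRIMARY KEY,"
          + " \"cf\".\"col1\" VARCHAR,"
          + " \"cf\".\"col2\" VARCHAR)");
    }
  }
}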


   Jaanai Zhang
   Best regards!



Thomas D'Silva  于2018年9月13日周四 上午6:03写道:

> Is there a reason you didn't use the spark-connector to serialize your
> data?
>
> On Wed, Sep 12, 2018 at 2:28 PM, Saif Addin  wrote:
>
>> Thank you Josh! That was helpful. Indeed, there was a salt bucket on the
>> table, and the key-column now shows correctly.
>>
>> However, the problem still persists in that the rest of the columns show
>> as completely empty on Phoenix (appear correctly on Hbase). We'll be
>> looking into this but if you have any further advice, appreciated.
>>
>> Saif
>>
>> On Wed, Sep 12, 2018 at 5:50 PM Josh Elser  wrote:
>>
>>> Reminder: Using Phoenix internals forces you to understand exactly how
>>> the version of Phoenix that you're using serializes data. Is there a
>>> reason you're not using SQL to interact with Phoenix?
>>>
>>> Sounds to me that Phoenix is expecting more data at the head of your
>>> rowkey. Maybe a salt bucket that you've defined on the table but not
>>> created?
>>>
>>> On 9/12/18 4:32 PM, Saif Addin wrote:
>>> > Hi all,
>>> >
>>> > We're trying to write tables with all string columns from spark.
>>> > We are not using the Spark Connector, instead we are directly writing
>>> > byte arrays from RDDs.
>>> >
>>> > The process works fine, and Hbase receives the data correctly, and
>>> > content is consistent.
>>> >
>>> > However reading the table from Phoenix, we notice the first character
>>> of
>>> > strings are missing. This sounds like it's a byte encoding issue, but
>>> > we're at loss. We're using PVarchar to generate bytes.
>>> >
>>> > Here's the snippet of code creating the RDD:
>>> >
>>> > val tdd = pdd.flatMap(x => {
>>> >val rowKey = PVarchar.INSTANCE.toBytes(x._1)
>>> >for(i <- 0 until cols.length) yield {
>>> >  other stuff for other columns ...
>>> >  ...
>>> >  (rowKey, (column1, column2, column3))
>>> >}
>>> > })
>>> >
>>> > ...
>>> >
>>> > We then create the following output to be written down in Hbase
>>> >
>>> > val output = tdd.map(x => {
>>> >  val rowKeyByte: Array[Byte] = x._1
>>> >  val immutableRowKey = new ImmutableBytesWritable(rowKeyByte)
>>> >
>>> >  val kv = new KeyValue(rowKeyByte,
>>> >  PVarchar.INSTANCE.toBytes(column1),
>>> >  PVarchar.INSTANCE.toBytes(column2),
>>> >PVarchar.INSTANCE.toBytes(column3)
>>> >  )
>>> >  (immutableRowKey, kv)
>>> > })
>>> >
>>> > By the way, we are using *KryoSerializer* in order to be able to
>>> > serialize all classes necessary for Hbase (KeyValue, BytesWritable,
>>> etc).
>>> >
>>> > The key of this table is the one missing data when queried from
>>> Phoenix.
>>> > So we guess something is wrong with the byte ser.
>>> >
>>> > Any ideas? Appreciated!
>>> > Saif
>>>
>>
>


Re: ABORTING region server and following HBase cluster "crash"

2018-09-10 Thread Jaanai Zhang
The root cause cannot be determined from the log information alone. The index
might have been corrupted, and it seems the servers keep aborting because of
the index write failure policy, which kills the region server whenever it
cannot write to an index table.


   Yun Zhang
   Best regards!



Batyrshin Alexander <0x62...@gmail.com> 于2018年9月10日周一 上午3:46写道:

> Correct me if im wrong.
>
> But looks like if you have A and B region server that has index and
> primary table then possible situation like this.
>
> A and B under writes on table with indexes
> A - crash
> B failed on index update because A is not operating then B starting
> aborting
> A after restart try to rebuild index from WAL but B at this time is
> aborting then A starting aborting too
> From this moment nothing happens (0 requests to region servers) and A and
> B is not responsible from Master-status web interface
>
>
> On 9 Sep 2018, at 04:38, Batyrshin Alexander <0x62...@gmail.com> wrote:
>
> After update we still can't recover HBase cluster. Our region servers
> ABORTING over and over:
>
> prod003:
> Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=92,queue=2,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536446665703: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:51:27 prod003 hbase[1440]: 2018-09-09 02:51:27,395 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=77,queue=7,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536446665703: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:52:19 prod003 hbase[1440]: 2018-09-09 02:52:19,224 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=82,queue=2,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536446665703: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:52:28 prod003 hbase[1440]: 2018-09-09 02:52:28,922 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=94,queue=4,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536446665703: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:55:02 prod003 hbase[957]: 2018-09-09 02:55:02,096 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=95,queue=5,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536450772841: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:55:18 prod003 hbase[957]: 2018-09-09 02:55:18,793 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=97,queue=7,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod003,60020,1536450772841: Could not update the index table,
> killing server region because couldn't write to an index table
>
> prod004:
> Sep 09 02:52:13 prod004 hbase[4890]: 2018-09-09 02:52:13,541 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=83,queue=3,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod004,60020,1536446387325: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:52:50 prod004 hbase[4890]: 2018-09-09 02:52:50,264 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=75,queue=5,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod004,60020,1536446387325: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:53:40 prod004 hbase[4890]: 2018-09-09 02:53:40,709 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=66,queue=6,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod004,60020,1536446387325: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:54:00 prod004 hbase[4890]: 2018-09-09 02:54:00,060 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=89,queue=9,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod004,60020,1536446387325: Could not update the index table,
> killing server region because couldn't write to an index table
>
> prod005:
> Sep 09 02:52:50 prod005 hbase[3772]: 2018-09-09 02:52:50,661 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=65,queue=5,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod005,60020,153644649: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:53:27 prod005 hbase[3772]: 2018-09-09 02:53:27,542 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=90,queue=0,port=60020]
> regionserver.HRegionServer: ABORTING region
> server prod005,60020,153644649: Could not update the index table,
> killing server region because couldn't write to an index table
> Sep 09 02:54:00 prod005 hbase[3772]: 2018-09-09 02:53:59,915 FATAL
> [RpcServer.default.FPBQ.Fifo.handler=7,queue=7,port=60020]
> 

Re: Phoenix CsvBulkLoadTool fails with java.sql.SQLException: ERROR 103 (08004): Unable to establish connection

2018-08-21 Thread Jaanai Zhang
Caused by: java.lang.IllegalAccessError: class
org.apache.hadoop.hdfs.web.HftpFileSystem
cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$
TokenManagementDelegator

This is the root cause: it seems that HBase 1.2 cannot access this interface of
Hadoop 3.1, so you should consider downgrading Hadoop or upgrading HBase.



   Yun Zhang
   Best regards!


2018-08-21 11:28 GMT+08:00 Mich Talebzadeh :

> Hi,
>
> The Hadoop version is Hadoop 3.1.0. Hbase is 1.2.6 and Phoenix is
> apache-phoenix-4.8.1-HBase-1.2-bin
>
> In the past I had issues with Hbase 2 working with Hadoop 3.1 so I had to
> use Hbase 1.2.6. The individual components work fine. In other words I can
> do all operations on Hbase with Hadoop 3.1 and Phoenix.
>
> The issue I am facing is using both  org.apache.phoenix.mapreduce.
> CsvBulkLoadTool and hbase.mapreduce.ImportTsv utilities.
>
> So I presume the issue may be to do with both these command line tools not
> working with Hadoop 3.1?
>
> Thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Tue, 21 Aug 2018 at 00:48, Sergey Soldatov 
> wrote:
>
>> If I read it correctly you are trying to use Phoenix and HBase that were
>> built against Hadoop 2 with Hadoop 3. Is HBase was the only component you
>> have upgraded?
>>
>> Thanks,
>> Sergey
>>
>> On Mon, Aug 20, 2018 at 1:42 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Here you go
>>>
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:java.library.path=/home/hduser/hadoop-3.1.0/lib
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:java.io.tmpdir=/tmp
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:java.compiler=
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:os.name=Linux
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:os.arch=amd64
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:os.version=3.10.0-862.3.2.el7.x86_64
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:user.name=hduser
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:user.home=/home/hduser
>>> 2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client
>>> environment:user.dir=/data6/hduser/streaming_data/2018-08-20
>>> 2018-08-20 18:29:47,249 INFO  [main] zookeeper.ZooKeeper: Initiating
>>> client connection, connectString=rhes75:2181 sessionTimeout=9
>>> watcher=hconnection-0x493d44230x0, quorum=rhes75:2181, baseZNode=/hbase
>>> 2018-08-20 18:29:47,261 INFO  [main-SendThread(rhes75:2181)]
>>> zookeeper.ClientCnxn: Opening socket connection to server rhes75/
>>> 50.140.197.220:2181. Will not attempt to authenticate using SASL
>>> (unknown error)
>>> 2018-08-20 18:29:47,264 INFO  [main-SendThread(rhes75:2181)]
>>> zookeeper.ClientCnxn: Socket connection established to rhes75/
>>> 50.140.197.220:2181, initiating session
>>> 2018-08-20 18:29:47,281 INFO  [main-SendThread(rhes75:2181)]
>>> zookeeper.ClientCnxn: Session establishment complete on server rhes75/
>>> 50.140.197.220:2181, sessionid = 0x1002ea99eed0077, negotiated timeout
>>> = 4
>>> Exception in thread "main" java.sql.SQLException: ERROR 103 (08004):
>>> Unable to establish connection.
>>> at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.
>>> newException(SQLExceptionCode.java:455)
>>> at org.apache.phoenix.exception.SQLExceptionInfo.buildException(
>>> SQLExceptionInfo.java:145)
>>> at org.apache.phoenix.query.ConnectionQueryServicesImpl.
>>> openConnection(ConnectionQueryServicesImpl.java:386)
>>> at org.apache.phoenix.query.ConnectionQueryServicesImpl.
>>> access$300(ConnectionQueryServicesImpl.java:222)
>>> at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(
>>> ConnectionQueryServicesImpl.java:2318)
>>> at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(
>>> ConnectionQueryServicesImpl.java:2294)
>>> at org.apache.phoenix.util.PhoenixContextExecutor.call(
>>> PhoenixContextExecutor.java:76)
>>> at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(
>>> ConnectionQueryServicesImpl.java:2294)
>>> at org.apache.phoenix.jdbc.PhoenixDriver.
>>> 

Re: Re: error when using apache-phoenix-4.14.0-HBase-1.2-bin with hbase 1.2.6

2018-08-06 Thread Jaanai Zhang
reference link: http://phoenix.apache.org/installation.html



   Yun Zhang
   Best regards!


2018-08-07 9:30 GMT+08:00 倪项菲 :

> Hi Zhang Yun,
> how do I deploy the Phoenix server? I only have the information from the
> Phoenix website, and it doesn't mention the Phoenix server
>
>
>
>
> 发件人: Jaanai Zhang 
> 时间: 2018/08/07(星期二)09:16
> 收件人: user ;
> 主题: Re: error when using apache-phoenix-4.14.0-HBase-1.2-bin with hbase
> 1.2.6
>
> Please ensure your Phoenix server jar was deployed and HBase was restarted
>
>
> 
>Yun Zhang
>Best regards!
>
>
> 2018-08-07 9:10 GMT+08:00 倪项菲 :
>
>>
>> Hi Experts,
>> I am using HBase 1.2.6,the cluster is working good with HMaster
>> HA,but when we integrate phoenix with hbase,it failed,below are the steps
>> 1,download apache-phoenix-4.14.0-HBase-1.2-bin from
>> http://phoenix.apache.org,the copy the tar file to the HMaster and unzip
>> the file
>> 2,copy phoenix-core-4.14.0-HBase-1.2.jar 
>> phoenix-4.14.0-HBase-1.2-server.jar
>> to all HBase nodes including HMaster and HRegionServer ,put them to
>> hbasehome/lib,my path is /opt/hbase-1.2.6/lib
>> 3,restart hbase cluster
>> 4,then start to use phoenix,but it return below error:
>>   [apache@plat-ecloud01-bigdata-journalnode01 bin]$ ./sqlline.py
>> plat-ecloud01-bigdata-zk01,plat-ecloud01-bigdata-zk02,plat-e
>> cloud01-bigdata-zk03
>> Setting property: [incremental, false]
>> Setting property: [isolation, TRANSACTION_READ_COMMITTED]
>> issuing: !connect jdbc:phoenix:plat-ecloud01-bigdata-zk01 none none
>> org.apache.phoenix.jdbc.PhoenixDriver
>> Connecting to jdbc:phoenix:plat-ecloud01-bigdata-zk01,
>> plat-ecloud01-bigdata-zk02,plat-ecloud01-bigdata-zk03
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/opt/apache-phoenix-
>> 4.14.0-HBase-1.2-bin/phoenix-4.14.0-HBase-1.2-client.jar!/or
>> g/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.6/sh
>> are/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/im
>> pl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> 18/08/06 18:40:08 WARN util.NativeCodeLoader: Unable to load
>> native-hadoop library for your platform... using builtin-java classes where
>> applicable
>> Error: org.apache.hadoop.hbase.DoNotRetryIOException: Unable to load
>> configured region split policy 
>> 'org.apache.phoenix.schema.MetaDataSplitPolicy'
>> for table 'SYSTEM.CATALOG' Set hbase.table.sanity.checks to false at conf
>> or table descriptor if you want to bypass sanity checks
>> at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionF
>> orFailure(HMaster.java:1754)
>> at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescr
>> iptor(HMaster.java:1615)
>> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.
>> java:1541)
>> at org.apache.hadoop.hbase.master.MasterRpcServices.createTable
>> (MasterRpcServices.java:463)
>> at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$
>> MasterService$2.callBlockingMethod(MasterProtos.java:55682)
>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:21
>> 96)
>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:
>> 112)
>> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExec
>> utor.java:133)
>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.ja
>> va:108)
>> at java.lang.Thread.run(Thread.java:745) (state=08000,code=101)
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException: Unable to load configured
>> region split policy 'org.apache.phoenix.schema.MetaDataSplitPolicy' for
>> table 'SYSTEM.CATALOG' Set hbase.table.sanity.checks to false at conf or
>> table descriptor if you want to bypass sanity checks
>> at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionF
>> orFailure(HMaster.java:1754)
>> at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescr
>> iptor(HMaster.java:1615)
>> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.
>> java:1541)
>> at org.apache.hadoop.hbase.master.MasterRpcServices.createTable
>> (MasterRpcServices.java:463)
>> at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$
>> MasterService$2.callBlockingMethod(MasterPro

Re: error when using apache-phoenix-4.14.0-HBase-1.2-bin with hbase 1.2.6

2018-08-06 Thread Jaanai Zhang
Please ensure your Phoenix server jar was deployed and HBase was restarted



   Yun Zhang
   Best regards!


2018-08-07 9:10 GMT+08:00 倪项菲 :

>
> Hi Experts,
> I am using HBase 1.2.6,the cluster is working good with HMaster HA,but
> when we integrate phoenix with hbase,it failed,below are the steps
> 1,download apache-phoenix-4.14.0-HBase-1.2-bin from
> http://phoenix.apache.org,the copy the tar file to the HMaster and unzip
> the file
> 2,copy phoenix-core-4.14.0-HBase-1.2.jar 
> phoenix-4.14.0-HBase-1.2-server.jar
> to all HBase nodes including HMaster and HRegionServer ,put them to
> hbasehome/lib,my path is /opt/hbase-1.2.6/lib
> 3,restart hbase cluster
> 4,then start to use phoenix,but it return below error:
>   [apache@plat-ecloud01-bigdata-journalnode01 bin]$ ./sqlline.py
> plat-ecloud01-bigdata-zk01,plat-ecloud01-bigdata-zk02,plat-
> ecloud01-bigdata-zk03
> Setting property: [incremental, false]
> Setting property: [isolation, TRANSACTION_READ_COMMITTED]
> issuing: !connect jdbc:phoenix:plat-ecloud01-bigdata-zk01 none none
> org.apache.phoenix.jdbc.PhoenixDriver
> Connecting to jdbc:phoenix:plat-ecloud01-bigdata-zk01,plat-ecloud01-
> bigdata-zk02,plat-ecloud01-bigdata-zk03
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/apache-phoenix-
> 4.14.0-HBase-1.2-bin/phoenix-4.14.0-HBase-1.2-client.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.6/
> share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/
> impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> 18/08/06 18:40:08 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Error: org.apache.hadoop.hbase.DoNotRetryIOException: Unable to load
> configured region split policy 'org.apache.phoenix.schema.MetaDataSplitPolicy'
> for table 'SYSTEM.CATALOG' Set hbase.table.sanity.checks to false at conf
> or table descriptor if you want to bypass sanity checks
> at org.apache.hadoop.hbase.master.HMaster.
> warnOrThrowExceptionForFailure(HMaster.java:1754)
> at org.apache.hadoop.hbase.master.HMaster.
> sanityCheckTableDescriptor(HMaster.java:1615)
> at org.apache.hadoop.hbase.master.HMaster.createTable(
> HMaster.java:1541)
> at org.apache.hadoop.hbase.master.MasterRpcServices.
> createTable(MasterRpcServices.java:463)
> at org.apache.hadoop.hbase.protobuf.generated.
> MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55682)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(
> RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.
> java:108)
> at java.lang.Thread.run(Thread.java:745) (state=08000,code=101)
> org.apache.phoenix.exception.PhoenixIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException:
> Unable to load configured region split policy 
> 'org.apache.phoenix.schema.MetaDataSplitPolicy'
> for table 'SYSTEM.CATALOG' Set hbase.table.sanity.checks to false at conf
> or table descriptor if you want to bypass sanity checks
> at org.apache.hadoop.hbase.master.HMaster.
> warnOrThrowExceptionForFailure(HMaster.java:1754)
> at org.apache.hadoop.hbase.master.HMaster.
> sanityCheckTableDescriptor(HMaster.java:1615)
> at org.apache.hadoop.hbase.master.HMaster.createTable(
> HMaster.java:1541)
> at org.apache.hadoop.hbase.master.MasterRpcServices.
> createTable(MasterRpcServices.java:463)
> at org.apache.hadoop.hbase.protobuf.generated.
> MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55682)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(
> RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.
> java:108)
> at java.lang.Thread.run(Thread.java:745)
>
> at org.apache.phoenix.util.ServerUtil.parseServerException(
> ServerUtil.java:144)
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.
> ensureTableCreated(ConnectionQueryServicesImpl.java:1197)
> at org.apache.phoenix.query.ConnectionQueryServicesImpl.
> createTable(ConnectionQueryServicesImpl.java:1491)
> at org.apache.phoenix.schema.MetaDataClient.createTableInternal(
> MetaDataClient.java:2717)
> at org.apache.phoenix.schema.MetaDataClient.createTable(
> MetaDataClient.java:1114)
> at org.apache.phoenix.compile.CreateTableCompiler$1.execute(
> CreateTableCompiler.java:192)
> at 

Re: Spark-Phoenix Plugin

2018-08-06 Thread Jaanai Zhang
You can get better performance if you read/write HBase directly. You can also
use spark-phoenix; this is an example that reads data from a CSV file and
writes it into a Phoenix table:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Wrapped in an object here only so the snippet compiles as-is.
object CsvToPhoenix {

  def main(args: Array[String]): Unit = {

    val sc = new SparkContext("local", "phoenix-test")
    val path = "/tmp/data"
    val hbaseConnectionString = "host1,host2,host3"

    // Schema of the pipe-delimited CSV input; every column is read as a string.
    val customSchema = StructType(Array(
      StructField("O_ORDERKEY", StringType, true),
      StructField("O_CUSTKEY", StringType, true),
      StructField("O_ORDERSTATUS", StringType, true),
      StructField("O_TOTALPRICE", StringType, true),
      StructField("O_ORDERDATE", StringType, true),
      StructField("O_ORDERPRIORITY", StringType, true),
      StructField("O_CLERK", StringType, true),
      StructField("O_SHIPPRIORITY", StringType, true),
      StructField("O_COMMENT", StringType, true)))

    //import com.databricks.spark.csv._
    val sqlContext = new SQLContext(sc)

    // Load the CSV file into a DataFrame via the spark-csv data source.
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("delimiter", "|")
      .option("header", "false")
      .schema(customSchema)
      .load(path)

    val start = System.currentTimeMillis()

    // Write the DataFrame into the Phoenix table DATAX; the connector issues
    // Phoenix UPSERTs under the hood.
    df.write.format("org.apache.phoenix.spark")
      .mode("overwrite")
      .option("table", "DATAX")
      .option("zkUrl", hbaseConnectionString)
      .save()

    val end = System.currentTimeMillis()
    print("taken time:" + ((end - start) / 1000) + "s")
  }
}





   Yun Zhang
   Best regards!


2018-08-06 20:10 GMT+08:00 Brandon Geise :

> Thanks for the reply Yun.
>
>
>
> I’m not quite clear how this would exactly help on the upsert side?  Are
> you suggesting deriving the type from Phoenix then doing the
> encoding/decoding and writing/reading directly from HBase?
>
>
>
> Thanks,
>
> Brandon
>
>
>
> *From: *Jaanai Zhang 
> *Reply-To: *
> *Date: *Sunday, August 5, 2018 at 9:34 PM
> *To: *
> *Subject: *Re: Spark-Phoenix Plugin
>
>
>
> You can get data type from Phoenix meta, then encode/decode data to
> write/read data. I think this way is effective, FYI :)
>
>
>
>
> 
>
>Yun Zhang
>
>Best regards!
>
>
>
>
>
> 2018-08-04 21:43 GMT+08:00 Brandon Geise :
>
> Good morning,
>
>
>
> I’m looking at using a combination of Hbase, Phoenix and Spark for a
> project and read that using the Spark-Phoenix plugin directly is more
> efficient than JDBC, however it wasn’t entirely clear from examples when
> writing a dataframe if an upsert is performed and how much fine-grained
> options there are for executing the upsert.  Any information someone can
> share would be greatly appreciated!
>
>
>
>
>
> Thanks,
>
> Brandon
>
>
>


Re: Spark-Phoenix Plugin

2018-08-05 Thread Jaanai Zhang
You can get the data types from the Phoenix metadata, then encode/decode the
data when you write/read it. I think this way is effective, FYI :)
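
A minimal sketch of pulling those types through standard JDBC metadata, with a
hypothetical connection string, schema, and table name:

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class PhoenixColumnTypes {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {
      DatabaseMetaData meta = conn.getMetaData();
      // Print every column of MY_SCHEMA.MY_TABLE with its SQL type name, which
      // tells you which Phoenix type to use when encoding/decoding the bytes.
      try (ResultSet rs = meta.getColumns(null, "MY_SCHEMA", "MY_TABLE", null)) {
        while (rs.next()) {
          System.out.println(rs.getString("COLUMN_NAME") + " -> " + rs.getString("TYPE_NAME"));
        }
      }
    }
  }
}

With the TYPE_NAME in hand you can pick the matching type class (PVarchar,
PInteger, and so on) for the byte conversion.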



   Yun Zhang
   Best regards!


2018-08-04 21:43 GMT+08:00 Brandon Geise :

> Good morning,
>
>
>
> I’m looking at using a combination of Hbase, Phoenix and Spark for a
> project and read that using the Spark-Phoenix plugin directly is more
> efficient than JDBC, however it wasn’t entirely clear from examples when
> writing a dataframe if an upsert is performed and how much fine-grained
> options there are for executing the upsert.  Any information someone can
> share would be greatly appreciated!
>
>
>
>
>
> Thanks,
>
> Brandon
>


Re: Row Scan In HBase Not Working When Table Created With Phoenix

2018-07-29 Thread Jaanai Zhang
You must use the schema to encode data if you want to use the HBase API, which
means you need to use some Phoenix code. This way is not recommended unless you
are a developer; using SQL is more convenient.
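
If the raw HBase API really is required, a minimal sketch of that idea against
the EMPPH2 table from this thread, assuming hbase-site.xml plus the HBase and
Phoenix client jars are on the classpath. Phoenix's PInteger flips the sign
bit, which is why the shell shows \x80\x00\x00\x01 for id 1:

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.phoenix.schema.types.PInteger;

public class RangeScanWithPhoenixEncoding {
  public static void main(String[] args) throws Exception {
    // Encode the scan bounds the same way Phoenix encoded the row keys.
    byte[] start = PInteger.INSTANCE.toBytes(1);
    byte[] stop = PInteger.INSTANCE.toBytes(3);

    try (Connection conn = ConnectionFactory.createConnection();
         Table table = conn.getTable(TableName.valueOf("EMPPH2"))) {
      Scan scan = new Scan(start, stop); // same rows as id >= 1 and id < 3
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
          System.out.println(r);
        }
      }
    }
  }
}

For most cases the plain Phoenix query (select * from EMPPH2 where id >= 1 and
id < 3) is the simpler route.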



   Yun Zhang
   Best regards!


2018-07-29 1:41 GMT+08:00 anil gupta :

> In addition to Miles comment, its recommended to use Phoenix for reads if
> you wrote data using Phoenix.
> To mimic your range scan in Phoenix query: select * from EMPPH2 where id
> >= 1 and id < 3;
>
> On Thu, Jul 26, 2018 at 11:10 PM, Miles Spielberg  wrote:
>
>> SQL INTEGER is not stored as strings, but as 4-bytes of encoded binary.
>> See https://phoenix.apache.org/language/datatypes.html#integer_type
>>
>> Miles Spielberg
>> Staff Software Engineer
>>
>>
>> O. 650.485.1102
>> 900 Jefferson Ave
>> 
>> Redwood City, CA 94063
>> 
>>
>> On Thu, Jul 26, 2018 at 5:44 PM, alchemist > > wrote:
>>
>>> create table empPh2(id integer primary key, fname varchar, lname
>>> varchar)COLUMN_ENCODED_BYTES=0
>>> upsert into empPh2 values (1, 'A', 'B');
>>> upsert into empPh2 values (2, 'B', 'B');
>>> upsert into empPh2 values (3, 'C', 'B');
>>> upsert into empPh2 values (4, 'John', 'B');
>>>
>>>
>>> Then when to HBase to do the range query using following command:
>>>
>>> hbase(main):004:0> scan 'EMPPH2', {STARTROW => '1', ENDROW => '3'}
>>> ROWCOLUMN+CELL
>>>
>>>
>>> 0 row(s) in 0.0030 seconds
>>>
>>> I saw row in HBASE has extra symbols.  Not sure how to have 1:1 mapping
>>> between HBASE table to Phoenix table.
>>>
>>> ROW  COLUMN+CELL
>>>
>>>
>>>  \x80\x00\x00\x01column=0:FNAME,
>>> timestamp=1532651140732, value=A
>>>
>>>  \x80\x00\x00\x01column=0:LNAME,
>>> timestamp=1532651140732, value=B
>>>
>>>  \x80\x00\x00\x01column=0:_0,
>>> timestamp=1532651140732, value=x
>>>
>>>  \x80\x00\x00\x02column=0:FNAME,
>>> timestamp=1532651151877, value=B
>>>
>>>  \x80\x00\x00\x02column=0:LNAME,
>>> timestamp=1532651151877, value=B
>>>
>>>  \x80\x00\x00\x02column=0:_0,
>>> timestamp=1532651151877, value=x
>>>
>>>  \x80\x00\x00\x03column=0:FNAME,
>>> timestamp=1532651164899, value=C
>>>
>>>  \x80\x00\x00\x03column=0:LNAME,
>>> timestamp=1532651164899, value=B
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
>>>
>>
>>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>


Re: How to support UPSERT with WHERE clause

2018-07-16 Thread Jaanai Zhang
Right now Phoenix does not support a WHERE clause within UPSERT. I also think
this function is very important and would be used frequently.

Maybe the dialect could look like this:
UPSERT INTO schema.table_name SET col = 'x' WHERE id = 'x'

This semantic would be easy to implement in Phoenix; we would only need a
single write RPC when the WHERE condition hits the row key.
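
As a stop-gap, an UPSERT SELECT whose source query carries the WHERE clause can
express the same conditional update today, although it reads the row back
through the SELECT instead of doing a single write RPC. A minimal sketch with a
hypothetical table and key:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ConditionalUpsert {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
         Statement stmt = conn.createStatement()) {
      conn.setAutoCommit(true);
      // Sets COL to 'x' only for the existing row whose primary key ID is 'k1'.
      stmt.executeUpdate(
          "UPSERT INTO MY_SCHEMA.MY_TABLE (ID, COL) "
              + "SELECT ID, 'x' FROM MY_SCHEMA.MY_TABLE WHERE ID = 'k1'");
    }
  }
}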


Re: How to run Phoenix Secondary Index Coprocessor with Hbase?

2018-07-12 Thread Jaanai Zhang
Only use the Phoenix API (the JDBC API) to access HBase if you want to use
secondary indexes; writes that go directly through the HBase API do not
maintain the index.
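
A minimal sketch of writing through JDBC so the index stays in sync, assuming a
hypothetical table with a secondary index that was created through Phoenix:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class UpsertThroughPhoenix {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
         PreparedStatement ps = conn.prepareStatement(
             "UPSERT INTO MY_TABLE (ID, FNAME, LNAME) VALUES (?, ?, ?)")) {
      ps.setInt(1, 1);
      ps.setString(2, "A");
      ps.setString(3, "B");
      ps.executeUpdate();
      // Phoenix connections do not auto-commit by default; the commit sends the
      // data row and its index row(s) so the secondary index stays consistent.
      conn.commit();
    }
  }
}

Raw HBase Puts bypass this path, which is consistent with the empty index table
you saw.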



   Yun Zhang
   Best regards!


2018-07-12 20:08 GMT+08:00 alchemist :

> I tried using Phoenix JDBC API to access data in a remote EMR server from
> another EC2 machine.  I tried multithreading the program but it is not
> scaling.I am getting 1 transaction per second.  This seems extremely slow.
> So I thought If I can use coprocessor written for Secondary Index by
> Phoenix
> using Hadoop HBase API then I can solve this problem.
>
> I tried creating secondary index using Phoenix and tried inserting the data
> using HBase put but it has not added data into the secondary index table. I
> am just wondering if there is any setting that I am missing,   I want to
> use
> Phoenix coprocessor written to manage secondary indexes with Hadoop HBase
> API.
>
>
>
> --
> Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
>


Re: AW: Duplicate Records Showing in Apache Phoenix

2018-06-20 Thread Jaanai Zhang
Maybe some fields were reflected incorrectly after upgrading from 4.8 to 4.12,
so it could not print all of the selected data.



   Yun Zhang
   Best regards!


2018-06-18 16:41 GMT+08:00 Azharuddin Shaikh :

> Hi,
>
> We have upgraded the phoenix version to 4.12 from 4.8 but now we are facing
> an issue while performing table load using hbase import table utility.
>
> Data is getting imported but when we are performing any select operation on
> the loaded table it is not reflecting any records, only columns are getting
> printed.
>
> We are facing this issue only after performing phoenix version upgrade from
> 4.8  to 4.12.
>
>
>
> --
> Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/
>


Re: Null array elements with joins

2018-06-19 Thread Jaanai Zhang
What's your Phoenix version?



   Yun Zhang
   Best regards!


2018-06-20 1:02 GMT+08:00 Tulasi Paradarami :

> Hi,
>
> I'm running few tests against Phoenix array and running into this bug
> where array elements return null values when a join is involved. Is this a
> known issue/limitation of arrays?
>
> create table array_test_1 (id integer not null primary key, arr
> tinyint[5]);
> upsert into array_test_1 values (1001, array[0, 0, 0, 0, 0]);
> upsert into array_test_1 values (1002, array[0, 0, 0, 0, 1]);
> upsert into array_test_1 values (1003, array[0, 0, 0, 1, 1]);
> upsert into array_test_1 values (1004, array[0, 0, 1, 1, 1]);
> upsert into array_test_1 values (1005, array[1, 1, 1, 1, 1]);
>
> create table test_table_1 (id integer not null primary key, val varchar);
> upsert into test_table_1 values (1001, 'abc');
> upsert into test_table_1 values (1002, 'def');
> upsert into test_table_1 values (1003, 'ghi');
>
> 0: jdbc:phoenix:localhost> select t1.id, t2.val, t1.arr[1], t1.arr[2],
> t1.arr[3] from array_test_1 as t1 join test_table_1 as t2 on t1.id = t2.id
> ;
> ++-++---
> -++
> | T1.ID  | T2.VAL  | ARRAY_ELEM(T1.ARR, 1)  | ARRAY_ELEM(T1.ARR, 2)  |
> ARRAY_ELEM(T1.ARR, 3)  |
> ++-++---
> -++
> | 1001   | abc | null   | null   |
> null   |
> | 1002   | def | null   | null   |
> null   |
> | 1003   | ghi | null   | null   |
> null   |
> ++-++---
> -++
> 3 rows selected (0.056 seconds)
>
> However, directly selecting array elements from the array returns data
> correctly.
> 0: jdbc:phoenix:localhost> select t1.id, t1.arr[1], t1.arr[2], t1.arr[3]
> from array_test_1 as t1;
> +---+-+-+---
> --+
> |  ID   | ARRAY_ELEM(ARR, 1)  | ARRAY_ELEM(ARR, 2)  | ARRAY_ELEM(ARR, 3)  |
> +---+-+-+---
> --+
> | 1001  | 0   | 0   | 0   |
> | 1002  | 0   | 0   | 0   |
> | 1003  | 0   | 0   | 0   |
> | 1004  | 0   | 0   | 1   |
> | 1005  | 1   | 1   | 1   |
> +---+-+-+---
> --+
> 5 rows selected (0.044 seconds)
>
>
>