Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-25 Thread Batyrshin Alexander
Yep, our test HBase cluster was misconfigured (ntpd was disabled). After time 
synchronisation I don't observe any delay between shell put and Phoenix select.

> On 25 Aug 2017, at 20:54, James Taylor wrote:
> 
> Phoenix retrieves the server timestamp from the region server that hosts the
> system catalog table and uses that as the timestamp of the puts when you do
> an UPSERT VALUES (FYI, this behavior will change in 4.12 and we'll use latest
> timestamp everywhere). I suspect the puts you're doing are going to a
> different region server and the clocks on the servers in your cluster are not
> synchronized.
> 
> If that's the case, the best option is to make sure your clocks are
> synchronized as that'll prevent other weird, unexpected behavior. If that's
> not an option, one workaround would be to set the CURRENT_SCN property on your
> connection to HConstants.LATEST_TIMESTAMP like this:
> 
> props.put(PhoenixRuntime.CURRENT_SCN_ATTRIB,
>     Long.toString(HConstants.LATEST_TIMESTAMP));
> conn = DriverManager.getConnection(getUrl(), props);



Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-25 Thread Batyrshin Alexander
I've already tested currentSCN and I can confirm that delays are gone.

Going to check clocks on cluster nodes...


> On 25 Aug 2017, at 20:54, James Taylor wrote:
> 
> Phoenix retrieves the server timestamp from the region server that hosts the
> system catalog table and uses that as the timestamp of the puts when you do
> an UPSERT VALUES (FYI, this behavior will change in 4.12 and we'll use latest
> timestamp everywhere). I suspect the puts you're doing are going to a
> different region server and the clocks on the servers in your cluster are not
> synchronized.
> 
> If that's the case, the best option is to make sure your clocks are
> synchronized as that'll prevent other weird, unexpected behavior. If that's
> not an option, one workaround would be to set the CURRENT_SCN property on your
> connection to HConstants.LATEST_TIMESTAMP like this:
> 
> props.put(PhoenixRuntime.CURRENT_SCN_ATTRIB,
>     Long.toString(HConstants.LATEST_TIMESTAMP));
> conn = DriverManager.getConnection(getUrl(), props);



Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-25 Thread James Taylor
Phoenix retrieves the server timestamp from the region server that hosts
the system catalog table and uses that as the timestamp of the puts when
you do an UPSERT VALUES (FYI, this behavior will change in 4.12 and we'll
use latest timestamp everywhere). I suspect the puts you're doing are going
to a different region server and the clocks on the servers in your cluster
are not synchronized.
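
As a rough illustration of the effect described above, the following self-contained Java sketch models a single cell whose versions are keyed by write timestamp, and a read that only sees versions strictly older than its upper bound. The class name, the clock values, and the 4-second skew are assumptions made for illustration; nothing here calls Phoenix or HBase.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of one cell; not a Phoenix or HBase API.
class ClockSkewSketch {
    // Versions of the cell, keyed by write timestamp in milliseconds.
    private final TreeMap<Long, String> versions = new TreeMap<>();

    // A direct put is stamped with the clock of the server that hosts the row.
    public void put(long writerClockMs, String value) {
        versions.put(writerClockMs, value);
    }

    // A read with an upper bound sees only versions written strictly before it.
    public String readAsOf(long upperBoundMs) {
        Map.Entry<Long, String> newest = versions.lowerEntry(upperBoundMs);
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        long catalogClock = 1_000_000L; // clock of the server hosting the system catalog
        long skew = 4_000L;             // assumed: the data server runs 4 s ahead

        ClockSkewSketch cell = new ClockSkewSketch();
        cell.put(catalogClock - 10_000L, "a"); // earlier write
        cell.put(catalogClock + skew, "b");    // shell put stamped by the fast clock

        System.out.println(cell.readAsOf(catalogClock));          // a (put is "in the future")
        System.out.println(cell.readAsOf(catalogClock + 5_000L)); // b (reader clock caught up)
        System.out.println(cell.readAsOf(Long.MAX_VALUE));        // b (no upper bound in practice)
    }
}
```

In this model the stale read lasts until the reader's upper bound passes the writer's timestamp, which would be consistent with a delay of a few seconds if the skew is of that size.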

If that's the case, the best option is to make sure your clocks are
synchronized as that'll prevent other weird, unexpected behavior. If that's
not an option, one workaround would be to set the CURRENT_SCN property on
your connection to HConstants.LATEST_TIMESTAMP like this:

props.put(PhoenixRuntime.CURRENT_SCN_ATTRIB,
    Long.toString(HConstants.LATEST_TIMESTAMP));
conn = DriverManager.getConnection(getUrl(), props);
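
To make the snippet above easier to try, here is a minimal, dependency-free sketch that only builds the connection properties. The string "CurrentSCN" and Long.MAX_VALUE are assumed to match PhoenixRuntime.CURRENT_SCN_ATTRIB and HConstants.LATEST_TIMESTAMP; verify both against your Phoenix and HBase versions. The class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper; builds properties only and does not open a connection.
class ScnPropsSketch {
    // Assumed to mirror PhoenixRuntime.CURRENT_SCN_ATTRIB.
    static final String CURRENT_SCN_ATTRIB = "CurrentSCN";
    // Assumed to mirror HConstants.LATEST_TIMESTAMP.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Returns properties that ask for reads bounded by the latest possible timestamp.
    static Properties latestTimestampProps() {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN_ATTRIB, Long.toString(LATEST_TIMESTAMP));
        return props;
    }

    public static void main(String[] args) {
        Properties props = latestTimestampProps();
        System.out.println(props.getProperty(CURRENT_SCN_ATTRIB)); // 9223372036854775807
        // With the Phoenix driver on the classpath, these properties would be
        // passed to DriverManager.getConnection(url, props).
    }
}
```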




On Fri, Aug 25, 2017 at 10:14 AM, Batyrshin Alexander <0x62...@gmail.com>
wrote:

> It's coming from the scan time range. If I run sqlline with 'currentSCN' set
> to a time in the future, then the select retrieves fresh data immediately.
>
> We already have software that writes with the HBase API. Now we are building
> a client that works with data from HBase via Phoenix.


Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-25 Thread Batyrshin Alexander
It's coming from the scan time range. If I run sqlline with 'currentSCN' set
to a time in the future, then the select retrieves fresh data immediately.

We already have software that writes with the HBase API. Now we are building
a client that works with data from HBase via Phoenix.


> On 25 Aug 2017, at 19:35, Josh Elser wrote:
> 
> Calls to put in the HBase shell, to the best of my knowledge, are
> synchronous. You should not have control returned to you until the update was
> committed by the RegionServers. HBase's data guarantees are that once a call
> to write data returns to you, all other readers *must* be able to see that
> update.
> 
> I'm not sure where this 3-5 second delay you describe is coming from.
> 
> Regardless, why are you writing data to HBase directly and circumventing the
> APIs to write data via Phoenix? If you want to access your data via Phoenix,
> you're going to run into less pain if you work completely at the Phoenix API
> level (tl;dr: use UPSERT to write data).



Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-25 Thread Josh Elser
Calls to put in the HBase shell, to the best of my knowledge, are 
synchronous. You should not have control returned to you until the 
update was committed by the RegionServers. HBase's data guarantees are 
that once a call to write data returns to you, all other readers *must* 
be able to see that update.


I'm not sure where this 3-5 second delay you describe is coming from.

Regardless, why are you writing data to HBase directly and circumventing
the APIs to write data via Phoenix? If you want to access your data via
Phoenix, you're going to run into less pain if you work completely at
the Phoenix API level (tl;dr: use UPSERT to write data).


On 8/24/17 2:58 PM, Batyrshin Alexander wrote:

Here is an example:

CREATE TABLE IF NOT EXISTS test (
   k VARCHAR NOT NULL,
   v VARCHAR,
   CONSTRAINT my_pk PRIMARY KEY (k)
);

0: jdbc:phoenix:> upsert into test(k,v) values ('1', 'a');
1 row affected (0.042 seconds)
0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | a  |
+----+----+


Then:

hbase(main):014:0> put 'TEST', '1', '0:V', 'b'
0 row(s) in 0.0100 seconds

The result in Phoenix becomes available after ~3-5 seconds:

0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | a  |
+----+----+
1 row selected (0.015 seconds)

... 5 seconds later

0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | b  |
+----+----+
1 row selected (0.026 seconds)






Re: Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-24 Thread Batyrshin Alexander
Here is an example:

CREATE TABLE IF NOT EXISTS test (
  k VARCHAR NOT NULL,
  v VARCHAR,
  CONSTRAINT my_pk PRIMARY KEY (k)
);

0: jdbc:phoenix:> upsert into test(k,v) values ('1', 'a');
1 row affected (0.042 seconds)
0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | a  |
+----+----+


Then:

hbase(main):014:0> put 'TEST', '1', '0:V', 'b'
0 row(s) in 0.0100 seconds

The result in Phoenix becomes available after ~3-5 seconds:

0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | a  |
+----+----+
1 row selected (0.015 seconds)

... 5 seconds later

0: jdbc:phoenix:> select * from test;
+----+----+
| K  | V  |
+----+----+
| 1  | b  |
+----+----+
1 row selected (0.026 seconds)


> On 24 Aug 2017, at 21:38, Batyrshin Alexander <0x62...@gmail.com> wrote:
> 
> Hello,
> 
> How can I decrease or even eliminate the delay between a direct HBase put (for
> example, from the HBase shell) and a SELECT from Phoenix?
> 
> My table has only 1 VERSION and does not use any block cache ({NAME =>
> 'invoice', COMPRESSION => 'LZO', BLOCKCACHE => 'false'}), so I do not
> understand where the previous value returned by SELECT comes from.



Delay between put from HBase shell and result in SELECT from Phoenix

2017-08-24 Thread Batyrshin Alexander
Hello,

How can I decrease or even eliminate the delay between a direct HBase put (for
example, from the HBase shell) and a SELECT from Phoenix?

My table has only 1 VERSION and does not use any block cache ({NAME =>
'invoice', COMPRESSION => 'LZO', BLOCKCACHE => 'false'}), so I do not
understand where the previous value returned by SELECT comes from.