[jira] [Created] (PHOENIX-5504) Metric calculation and understanding of these values in Phoenix

2019-10-01 Thread Prashant Agrawal (Jira)
Prashant Agrawal created PHOENIX-5504:
-

 Summary: Metric calculation and understanding of these values in 
Phoenix
 Key: PHOENIX-5504
 URL: https://issues.apache.org/jira/browse/PHOENIX-5504
 Project: Phoenix
  Issue Type: Task
Reporter: Prashant Agrawal


Hi Team,

We are using Phoenix to query data from HBase and are seeing a discrepancy 
in the metrics that Phoenix reports. Could someone please help us understand 
them? Here is the use case:

1) I ran the query: select * from "db"."table" where "status" = "ACTIVE";

2) I added a plain Java timer that starts before the query and stops after 
the results and the metrics have been extracted.

3) The metrics are extracted with:
 Map<MetricType, Long> overallQueryMetrics = 
PhoenixRuntime.getOverAllReadRequestMetricInfo(resultSet);
 Map<String, Map<MetricType, Long>> requestReadMetrics = 
PhoenixRuntime.getRequestReadMetricInfo(resultSet);

4) In outline, the code does the following:
{code:java}
- Start a timer to calculate the duration
- Run the query and get the result set
- Read all rows from the result set
- Call getOverAllReadRequestMetricInfo and getRequestReadMetricInfo on the 
result set
- Stop the timer and record the elapsed time as duration.{code}
5) The resulting metrics are:
Sample 1 (all times in milliseconds):
{code:java}
duration : 151
WALL_CLOCK_TIME_MS : 292
TASK_EXECUTION_TIME : 510
TASK_END_TO_END_TIME : 514
RESULT_SET_TIME_MS : 292
TASK_EXECUTED_COUNTER: 5{code}

Sample 2 (all times in milliseconds):
{code:java}
duration 2,750
RESULT_SET_TIME_MS 5,456
TASK_END_TO_END_TIME 12
TASK_EXECUTED_COUNTER 1
TASK_EXECUTION_TIME 11
TASK_QUEUE_WAIT_TIME 1
TASK_REJECTED_COUNTER 0
WALL_CLOCK_TIME_MS 5,456{code}

So, could someone please tell me which metric should be taken as the time 
Phoenix took to run the query? The duration measured by my timer is much 
lower than WALL_CLOCK_TIME_MS and several of the other metrics that Phoenix 
returns.
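
The relationships between the figures in the two samples can be checked with a little arithmetic. The following is a minimal sketch, not Phoenix code: the class and method names are made up for illustration, the numbers are the ones reported above, and the per-task average assumes that TASK_EXECUTION_TIME is a sum over all executed tasks, which this report does not confirm.

```java
public class MetricSampleCheck {

    /** Ratio of a reported metric to the client-side timer, e.g. WALL_CLOCK_TIME_MS / duration. */
    public static double ratio(long metricMillis, long durationMillis) {
        return (double) metricMillis / durationMillis;
    }

    /** Average time per task, assuming (hypothetically) that the metric is a sum over all tasks. */
    public static double averagePerTask(long totalTaskMillis, long taskCount) {
        return (double) totalTaskMillis / taskCount;
    }

    public static void main(String[] args) {
        // Sample 1: duration 151, WALL_CLOCK_TIME_MS 292, TASK_EXECUTION_TIME 510, 5 tasks
        System.out.println("sample 1 wall/duration: " + ratio(292, 151));
        System.out.println("sample 1 avg task ms:   " + averagePerTask(510, 5));
        // Sample 2: duration 2750, WALL_CLOCK_TIME_MS 5456
        System.out.println("sample 2 wall/duration: " + ratio(5456, 2750));
    }
}
```

In both samples WALL_CLOCK_TIME_MS comes out at roughly twice the timer value (about 1.93 and 1.98). In Sample 1, the 510 ms of task time would be about 102 ms per task if it is a sum over the 5 tasks, which would fit within the 151 ms measured by the timer.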



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (PHOENIX-5504) Metric calculation and understanding of these values in Phoenix

2019-10-01 Thread Prashant Agrawal (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Agrawal updated PHOENIX-5504:
--
Description: 
Hi Team,

We are using Phoenix to query data from HBase and are seeing a discrepancy 
in the metrics that Phoenix reports. Could someone please help us understand 
them? Here is the use case:

1) I ran the query: select * from "db"."table" where "status" = "ACTIVE";

2) I added a plain Java timer that starts before the query and stops after 
the results and the metrics have been extracted.

3) The metrics are extracted with:
 Map<MetricType, Long> overallQueryMetrics = 
PhoenixRuntime.getOverAllReadRequestMetricInfo(resultSet);
 Map<String, Map<MetricType, Long>> requestReadMetrics = 
PhoenixRuntime.getRequestReadMetricInfo(resultSet);

4) In outline, the code does the following:
{code:java}
- Start a timer to calculate the duration
- Run the query and get the result set
- Read all rows from the result set
- Call getOverAllReadRequestMetricInfo and getRequestReadMetricInfo on the 
result set
- Stop the timer and record the elapsed time as duration.{code}
5) The resulting metrics are:
 Sample 1 (all times in milliseconds):
{code:java}
duration : 151
WALL_CLOCK_TIME_MS : 292
TASK_EXECUTION_TIME : 510
TASK_END_TO_END_TIME : 514
RESULT_SET_TIME_MS : 292
TASK_EXECUTED_COUNTER: 5{code}
Sample 2 (all times in milliseconds):
{code:java}
duration 2,750
RESULT_SET_TIME_MS 5,456
TASK_END_TO_END_TIME 12
TASK_EXECUTED_COUNTER 1
TASK_EXECUTION_TIME 11
TASK_QUEUE_WAIT_TIME 1
TASK_REJECTED_COUNTER 0
WALL_CLOCK_TIME_MS 5,456{code}
So, could someone please tell me which metric should be taken as the time 
Phoenix took to run the query? The duration measured by my timer is much 
lower than WALL_CLOCK_TIME_MS and several of the other metrics that Phoenix 
returns.

 

*PS: Sorry if this looks like spam, but I could not find a dedicated forum 
for this question, so I created this issue here.* 


> Metric calculation and understanding of these values in Phoenix
> ---
>
> Key: PHOENIX-5504
> URL: https://issues.apache.org/jira/browse/PHOENIX-5504
> Project: Phoenix
>  Issue Type: Task
>Reporter: Prashant Agrawal
>Priority: Major
>

ApacheCon North America 2020, project participation

2019-10-01 Thread Rich Bowen
Hi, folks,

(Note: You're receiving this email because you're on the dev@ list for
one or more Apache Software Foundation projects.)

For ApacheCon North America 2019, we asked projects to participate in
the creation of project/topic specific tracks. This was very successful,
with about 15 projects stepping up to curate the content for their
track/summit/event.

We need to know if you're going to do the same for 2020. This informs
how large a venue we book for the event, how long the event runs, and
many other considerations.

If you intend to participate again in 2020, we need to hear from you on
the plann...@apachecon.com mailing list. This is not a firm commitment,
but we need to know if you're, say, 75% confident that you'll be
participating.

And, no, we do not have any details at all, but assume that it will be
in roughly the same calendar space as this year's event, i.e., somewhere
in the August-October timeframe.

Thanks.

-- 
Rich Bowen
VP Conferences
The Apache Software Foundation
@apachecon


[jira] [Created] (PHOENIX-5505) Index read repair does not repair unverified rows with higher timestamp

2019-10-01 Thread Kadir Ozdemir (Jira)
Kadir Ozdemir created PHOENIX-5505:
--

 Summary: Index read repair does not repair unverified rows with 
higher timestamp 
 Key: PHOENIX-5505
 URL: https://issues.apache.org/jira/browse/PHOENIX-5505
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.14.3, 5.0.0, 4.15.0
Reporter: Kadir Ozdemir


Read repair (GlobalIndexChecker) sets the time range of the scan on the data 
table using the timestamp of the index table row to be repaired. In the 
current implementation, the start time of the scan is the timestamp of the 
index row. However, if the index row timestamp is higher than the data table 
row timestamp, the data table row is not visible to the scan. The index row 
timestamp can be higher when the index row is overwritten with the unverified 
row status (in the first write phase) but the data table row is not 
overwritten (in the second write phase) because of a failure. In this case, 
the unverified index row is not rebuilt and is eventually deleted.
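
The visibility problem can be reproduced with a toy model of a time-range scan. This is a minimal sketch, not the GlobalIndexChecker code: the class, the method names, and the timestamps are hypothetical, and it assumes a half-open range [min, max) in which a row is visible only if its timestamp falls inside the range. The upper bound is left open here because the description does not state it.

```java
public class ReadRepairTimeRange {

    /** A row is visible to a scan only if minTs <= rowTs < maxTs (half-open range). */
    public static boolean isVisible(long rowTs, long minTs, long maxTs) {
        return rowTs >= minTs && rowTs < maxTs;
    }

    /** Models the behavior described above: the scan starts at the index row timestamp. */
    public static boolean dataRowVisibleToRepair(long dataRowTs, long indexRowTs) {
        return isVisible(dataRowTs, indexRowTs, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Normal case: both rows carry the same timestamp, so the data row is found.
        System.out.println(dataRowVisibleToRepair(100L, 100L)); // prints true
        // Failure case: the index row was overwritten as unverified at ts=200,
        // but the data table write failed, so the data row still has ts=100.
        System.out.println(dataRowVisibleToRepair(100L, 200L)); // prints false
    }
}
```

In the second call the scan window starts after the data row's timestamp, so the repair logic in this model sees no data row to rebuild from.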





[jira] [Assigned] (PHOENIX-5503) IndexTool does not rebuild all the rows

2019-10-01 Thread Kadir OZDEMIR (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR reassigned PHOENIX-5503:
--

Assignee: Kadir OZDEMIR

> IndexTool does not rebuild all the rows
> ---
>
> Key: PHOENIX-5503
> URL: https://issues.apache.org/jira/browse/PHOENIX-5503
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.15.0, 4.14.3
>Reporter: Kadir Ozdemir
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5503.master.001.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> IndexTool builds only a subset of the rows of an index table. Rows created 
> after the timestamp of the PTable (i.e., the table metadata structure 
> maintained in memory) for the index table are not rebuilt. This timestamp 
> is updated by the PTable builder, that is, at the time the PTable structure 
> is created in memory. According to the code comment (see 
> IndexTool.getJob()), this is done to "ensure index tables remains 
> consistent post population". Such a consistency issue may exist for 
> client-side index builds, but PHOENIX-5018 (server-side index build) does 
> not have it. However, PHOENIX-5018 did not change this behavior. To upgrade 
> an index table to PHOENIX-5156 (new index design), we need to rebuild all 
> the rows created before the table is upgraded (not only those created 
> before the timestamp of the index table). Also, we may want to use 
> IndexTool to rebuild an index table online. In that case, we may want to 
> include as many rows as possible. This issue is to make sure that IndexTool 
> builds all the rows that exist at the time the tool starts running. 
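
The effect of the cutoff can be illustrated with a small filter over row timestamps. This is only a sketch under stated assumptions, not IndexTool code: the class, the method, and the timestamp values are hypothetical, and the cutoff is modeled as an inclusive upper bound on the row timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RebuildCutoff {

    /** Returns the timestamps of the rows a rebuild would cover for a given cutoff. */
    public static List<Long> rowsToRebuild(List<Long> rowTimestamps, long cutoffTs) {
        List<Long> selected = new ArrayList<>();
        for (long ts : rowTimestamps) {
            if (ts <= cutoffTs) { // rows created after the cutoff are skipped
                selected.add(ts);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Long> rows = Arrays.asList(10L, 20L, 30L, 40L);
        long ptableTs = 25L;    // hypothetical timestamp of the index PTable
        long toolStartTs = 50L; // hypothetical time at which the tool starts
        System.out.println(rowsToRebuild(rows, ptableTs));    // prints [10, 20]
        System.out.println(rowsToRebuild(rows, toolStartTs)); // prints [10, 20, 30, 40]
    }
}
```

With the PTable timestamp as the cutoff, the two newest rows in this model are left out. Using the tool's start time as the cutoff covers every row that exists when the tool starts.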





[jira] [Assigned] (PHOENIX-5505) Index read repair does not repair unverified rows with higher timestamp

2019-10-01 Thread Kadir OZDEMIR (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR reassigned PHOENIX-5505:
--

Assignee: Kadir OZDEMIR

> Index read repair does not repair unverified rows with higher timestamp 
> 
>
> Key: PHOENIX-5505
> URL: https://issues.apache.org/jira/browse/PHOENIX-5505
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.15.0, 4.14.3
>Reporter: Kadir Ozdemir
>Assignee: Kadir OZDEMIR
>Priority: Major





[jira] [Updated] (PHOENIX-5503) IndexTool does not rebuild all the rows

2019-10-01 Thread Kadir OZDEMIR (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5503:
---
Attachment: PHOENIX-5503.master.002.patch

> IndexTool does not rebuild all the rows
> ---
>
> Key: PHOENIX-5503
> URL: https://issues.apache.org/jira/browse/PHOENIX-5503
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.15.0, 4.14.3
>Reporter: Kadir Ozdemir
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5503.master.001.patch, 
> PHOENIX-5503.master.002.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h


