[jira] [Created] (PHOENIX-7412) Phoenix 5.2.1 release

2024-09-28 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7412:
-

 Summary: Phoenix 5.2.1 release
 Key: PHOENIX-7412
 URL: https://issues.apache.org/jira/browse/PHOENIX-7412
 Project: Phoenix
  Issue Type: Task
Reporter: Viraj Jasani


# Clean up fix versions
 # Spin RCs + Close the repository on [https://repository.apache.org/#stagingRepositories]
 # "Release" staged nexus repository
 # Promote RC artifacts in SVN
 # Update reporter tool with the released version
 # Push signed release tag
 # Add release version to the download page
 # Send announce email



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7412) Phoenix 5.2.1 release

2024-09-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7412:
-

Assignee: Viraj Jasani

> Phoenix 5.2.1 release
> -
>
> Key: PHOENIX-7412
> URL: https://issues.apache.org/jira/browse/PHOENIX-7412
> Project: Phoenix
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> # Clean up fix versions
>  # Spin RCs + Close the repository on [https://repository.apache.org/#stagingRepositories]
>  # "Release" staged nexus repository
>  # Promote RC artifacts in SVN
>  # Update reporter tool with the released version
>  # Push signed release tag
>  # Add release version to the download page
>  # Send announce email



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7385) Fix MetadataGetTableReadLockIT flapper

2024-09-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7385:
--
Fix Version/s: 5.2.1
   5.3.0

> Fix MetadataGetTableReadLockIT flapper
> --
>
> Key: PHOENIX-7385
> URL: https://issues.apache.org/jira/browse/PHOENIX-7385
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Palash Chauhan
>Assignee: Palash Chauhan
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> MetadataGetTableReadLockIT.testBlockedReadDoesNotBlockAnotherRead creates 
> two threads. The first thread takes a read lock while performing a query and 
> sleeps for a duration. The second thread also performs a query. The test 
> verifies that the second thread was not blocked, i.e., it finishes faster 
> than the first thread's sleep duration. The test setup removes the 
> MetaDataEndpointImpl coproc and loads an extension of it which can be 
> configured to sleep after acquiring a lock.
> Currently, the test class is annotated with ParallelStatsDisabledTest, which 
> means other tests can interfere with the coproc loaded on SYSTEM.CATALOG. We 
> can annotate it with NeedsOwnMiniClusterTest instead.
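The concurrency property the test asserts (two readers do not block each other) can be sketched with plain JDK locks, independent of Phoenix. This is an illustrative analogue only; the class name, sleep durations, and use of ReentrantReadWriteLock are assumptions, not the actual test code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadersDoNotBlockEachOther {
    static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    // Simulates the first query: holds the read lock and sleeps, like the patched coproc.
    static void slowRead(long sleepMs) throws InterruptedException {
        LOCK.readLock().lock();
        try { Thread.sleep(sleepMs); } finally { LOCK.readLock().unlock(); }
    }

    // Simulates the second query: measures how long acquiring the read lock takes.
    static long timedReadMs() {
        long start = System.nanoTime();
        LOCK.readLock().lock();
        try { return (System.nanoTime() - start) / 1_000_000; }
        finally { LOCK.readLock().unlock(); }
    }

    public static void main(String[] args) throws Exception {
        long sleepMs = 2000;
        Thread first = new Thread(() -> {
            try { slowRead(sleepMs); } catch (InterruptedException ignored) { }
        });
        first.start();
        Thread.sleep(100); // let the first reader take the lock
        long elapsed = timedReadMs();
        first.join();
        // The second read must finish well before the first reader's sleep elapses.
        if (elapsed >= sleepMs) throw new AssertionError("second reader was blocked: " + elapsed + " ms");
        System.out.println("second reader not blocked; waited ms: " + elapsed);
    }
}
```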



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7402) Even if a row is updated within TTL it's getting expired partially

2024-09-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7402.
---
Resolution: Fixed

> Even if a row is updated within TTL it's getting expired partially
> -
>
> Key: PHOENIX-7402
> URL: https://issues.apache.org/jira/browse/PHOENIX-7402
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0, 5.2.1, 5.3.0
>Reporter: Sanjeet Malhotra
>Assignee: Sanjeet Malhotra
>Priority: Critical
> Fix For: 5.2.1, 5.3.0
>
>
> Even though a row is updated within the TTL, it still gets partially expired. 
> The following IT in MaxLookbackExtendedIT can be used to reproduce the issue:
> {code:java}
> @Test
> public void testRetainingLastRowVersion() throws Exception {
>     if (multiCF) {
>         return;
>     }
>     if (hasTableLevelMaxLookback) {
>         optionBuilder.append(", MAX_LOOKBACK_AGE=" + TABLE_LEVEL_MAX_LOOKBACK_AGE * 1000);
>         tableDDLOptions = optionBuilder.toString();
>     }
>     try (Connection conn = DriverManager.getConnection(getUrl())) {
>         String tableName = generateUniqueName();
>         createTable(tableName);
>         injectEdge.setValue(System.currentTimeMillis());
>         EnvironmentEdgeManager.injectEdge(injectEdge);
>         injectEdge.incrementValue(1);
>         Statement stmt = conn.createStatement();
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab', 'abc', 'abcd')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab1')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab2')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab3')");
>         conn.commit();
>         injectEdge.incrementValue(11 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('b', 'bc', 'bcd', 'bcde')");
>         conn.commit();
>         injectEdge.incrementValue(1);
>         TableName dataTableName = TableName.valueOf(tableName);
>         TestUtil.dumpTable(conn, dataTableName);
>         flush(dataTableName);
>         injectEdge.incrementValue(1);
>         TestUtil.dumpTable(conn, dataTableName);
>         majorCompact(dataTableName);
>         TestUtil.dumpTable(conn, dataTableName);
>         injectEdge.incrementValue(1);
>         ResultSet rs = stmt.executeQuery("select * from " + dataTableName + " where id = 'a'");
>         // TestUtil.printResultSet(rs);
>         while (rs.next()) {
>             assertNotNull(rs.getString(3));
>             assertNotNull(rs.getString(4));
>         }
>     }
> } {code}
> The TTL in the above IT is 30 sec, the table-level max lookback age is 10 
> sec, and the cluster-level max lookback is 15 sec.
> The IT fails at the last two checks:
> {code:java}
> while (rs.next()) {
>     assertNotNull(rs.getString(3));
>     assertNotNull(rs.getString(4));
> } {code}
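The timeline arithmetic behind the repro can be checked directly: row 'a' is re-upserted at 16-second intervals, each well inside the 30-second TTL, so no version gap ever exceeds the TTL and the row should survive compaction intact. A small sketch of that check (illustrative only; the class and method names are not from the test):

```java
public class TtlTimeline {
    // Gaps in ms between consecutive upserts of row 'a', from the injectEdge increments above.
    static final long[] GAPS_MS = {16_000, 16_000, 16_000};
    static final long TTL_MS = 30_000;

    // True if every update happened within TTL of the previous one, i.e. the
    // row was continuously kept alive and should not be expired, even partially.
    static boolean rowSurvives(long[] gapsMs, long ttlMs) {
        for (long gap : gapsMs) {
            if (gap >= ttlMs) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Each 16 s gap is below the 30 s TTL, so the row must be retained.
        System.out.println(rowSurvives(GAPS_MS, TTL_MS));
    }
}
```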



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7411) Atomic Delete: PhoenixStatement API to return row if single row is atomically deleted

2024-09-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7411:
--
Fix Version/s: 5.3.0

> Atomic Delete: PhoenixStatement API to return row if single row is atomically 
> deleted
> -
>
> Key: PHOENIX-7411
> URL: https://issues.apache.org/jira/browse/PHOENIX-7411
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> PHOENIX-7398 introduces a new PhoenixStatement API to return the row for 
> Atomic/Conditional Upserts. Phoenix already supports atomic Upserts using the 
> ON DUPLICATE KEY clause. However, Deletes are not supported as atomic 
> operations, and a single Delete query can delete multiple rows based on the 
> WHERE clause used.
> HBase does not provide an API to return row state for atomic updates with Put 
> and Delete mutations. The HBase API checkAndMutate() also supports returning 
> Result for Append and Increment mutations only, not for Put and Delete 
> mutations.
> The purpose of this Jira is to introduce support for atomic delete of a 
> single row that returns the row only if it existed before executing the 
> Delete mutation. If the row is already deleted, the API is not expected to 
> return any row. For a single-row delete to be atomic, IndexRegionObserver 
> needs to take a row lock on the to-be-deleted row and also scan the row, 
> similar to atomic Put mutation(s).
>  
> The PhoenixStatement API signature is the same as in PHOENIX-7398:
> {code:java}
> /**
>  * Executes the given SQL statement similar to JDBC API executeUpdate() but also
>  * returns the updated or non-updated row as Result object back to the client.
>  * This must be used with auto-commit Connection. This makes the operation atomic.
>  * If the row is successfully updated, return the updated row, otherwise if the
>  * row cannot be updated, return the non-updated row.
>  *
>  * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
>  * @return The pair of int and Tuple, where int represents value 1 for successful
>  * row update and 0 for non-successful row update, and Tuple represents the state
>  * of the row.
>  * @throws SQLException If the statement cannot be executed.
>  */
> public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws SQLException {{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PHOENIX-7411) Atomic Delete: PhoenixStatement API to return row if single row is atomically deleted

2024-09-24 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7411:
-

 Summary: Atomic Delete: PhoenixStatement API to return row if 
single row is atomically deleted
 Key: PHOENIX-7411
 URL: https://issues.apache.org/jira/browse/PHOENIX-7411
 Project: Phoenix
  Issue Type: New Feature
Reporter: Viraj Jasani


PHOENIX-7398 introduces a new PhoenixStatement API to return the row for 
Atomic/Conditional Upserts. Phoenix already supports atomic Upserts using the 
ON DUPLICATE KEY clause. However, Deletes are not supported as atomic 
operations, and a single Delete query can delete multiple rows based on the 
WHERE clause used.

HBase does not provide an API to return row state for atomic updates with Put 
and Delete mutations. The HBase API checkAndMutate() also supports returning 
Result for Append and Increment mutations only, not for Put and Delete 
mutations.

The purpose of this Jira is to introduce support for atomic delete of a single 
row that returns the row only if it existed before executing the Delete 
mutation. If the row is already deleted, the API is not expected to return any 
row. For a single-row delete to be atomic, IndexRegionObserver needs to take a 
row lock on the to-be-deleted row and also scan the row, similar to atomic Put 
mutation(s).

 

The PhoenixStatement API signature is the same as in PHOENIX-7398:
{code:java}
/**
 * Executes the given SQL statement similar to JDBC API executeUpdate() but also
 * returns the updated or non-updated row as Result object back to the client.
 * This must be used with auto-commit Connection. This makes the operation atomic.
 * If the row is successfully updated, return the updated row, otherwise if the
 * row cannot be updated, return the non-updated row.
 *
 * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
 * @return The pair of int and Tuple, where int represents value 1 for successful
 * row update and 0 for non-successful row update, and Tuple represents the state
 * of the row.
 * @throws SQLException If the statement cannot be executed.
 */
public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws SQLException {{code}
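The intended return contract for an atomic single-row delete can be sketched with a map-backed mock: status 1 plus the old row when the row existed and was deleted, status 0 with no row when it was already gone. This is a hypothetical illustration of the semantics only, not the Phoenix implementation; the class, method, and table names are invented.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;

public class AtomicDeleteContract {
    // Stand-in for the table: primary key -> row payload.
    static final Map<String, String> TABLE = new HashMap<>();

    // Mirrors the proposed semantics: if the row existed, delete it atomically
    // and return (1, old row); if it was already deleted, return (0, null).
    static synchronized SimpleImmutableEntry<Integer, String> deleteReturnRow(String pk) {
        String old = TABLE.remove(pk);
        return old != null
            ? new SimpleImmutableEntry<>(1, old)
            : new SimpleImmutableEntry<>(0, null);
    }

    public static void main(String[] args) {
        TABLE.put("a", "ab,abc,abcd");
        System.out.println(deleteReturnRow("a")); // row existed: status 1 with the old row
        System.out.println(deleteReturnRow("a")); // already deleted: status 0, no row
    }
}
```

The `synchronized` keyword here plays the role of the row lock taken by IndexRegionObserver: the read-then-delete must be one indivisible step, or a concurrent delete could make both callers believe they removed the row.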



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7411) Atomic Delete: PhoenixStatement API to return row if single row is atomically deleted

2024-09-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7411:
-

Assignee: Viraj Jasani

> Atomic Delete: PhoenixStatement API to return row if single row is atomically 
> deleted
> -
>
> Key: PHOENIX-7411
> URL: https://issues.apache.org/jira/browse/PHOENIX-7411
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> PHOENIX-7398 introduces a new PhoenixStatement API to return the row for 
> Atomic/Conditional Upserts. Phoenix already supports atomic Upserts using the 
> ON DUPLICATE KEY clause. However, Deletes are not supported as atomic 
> operations, and a single Delete query can delete multiple rows based on the 
> WHERE clause used.
> HBase does not provide an API to return row state for atomic updates with Put 
> and Delete mutations. The HBase API checkAndMutate() also supports returning 
> Result for Append and Increment mutations only, not for Put and Delete 
> mutations.
> The purpose of this Jira is to introduce support for atomic delete of a 
> single row that returns the row only if it existed before executing the 
> Delete mutation. If the row is already deleted, the API is not expected to 
> return any row. For a single-row delete to be atomic, IndexRegionObserver 
> needs to take a row lock on the to-be-deleted row and also scan the row, 
> similar to atomic Put mutation(s).
>  
> The PhoenixStatement API signature is the same as in PHOENIX-7398:
> {code:java}
> /**
>  * Executes the given SQL statement similar to JDBC API executeUpdate() but also
>  * returns the updated or non-updated row as Result object back to the client.
>  * This must be used with auto-commit Connection. This makes the operation atomic.
>  * If the row is successfully updated, return the updated row, otherwise if the
>  * row cannot be updated, return the non-updated row.
>  *
>  * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
>  * @return The pair of int and Tuple, where int represents value 1 for successful
>  * row update and 0 for non-successful row update, and Tuple represents the state
>  * of the row.
>  * @throws SQLException If the statement cannot be executed.
>  */
> public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws SQLException {{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7343) Support for complex types in CDC

2024-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7343:
--
Fix Version/s: 5.3.0

> Support for complex types in CDC
> 
>
> Key: PHOENIX-7343
> URL: https://issues.apache.org/jira/browse/PHOENIX-7343
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Hari Krishna Dara
>Assignee: Hari Krishna Dara
>Priority: Major
> Fix For: 5.3.0
>
>
> Support for the two complex types, viz. ARRAY and JSON, needs to be added for 
> CDC.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7343) Support for complex types in CDC

2024-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7343.
---
Resolution: Fixed

> Support for complex types in CDC
> 
>
> Key: PHOENIX-7343
> URL: https://issues.apache.org/jira/browse/PHOENIX-7343
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Hari Krishna Dara
>Assignee: Hari Krishna Dara
>Priority: Major
> Fix For: 5.3.0
>
>
> Support for the two complex types, viz. ARRAY and JSON, needs to be added for 
> CDC.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7357) New variable length binary data type: VARBINARY_ENCODED

2024-09-18 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7357.
---
Resolution: Fixed

> New variable length binary data type: VARBINARY_ENCODED
> ---
>
> Key: PHOENIX-7357
> URL: https://issues.apache.org/jira/browse/PHOENIX-7357
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> As of today, Phoenix provides several variable-length as well as fixed-length 
> data types. One of the variable-length data types is VARBINARY, a 
> variable-length binary blob. Using VARBINARY as the only primary key column 
> is effectively the same as using the raw HBase row key.
> HBase provides a single row key. Any client application that requires more 
> than one primary key column must handle storing all column values as a single 
> binary row key. Phoenix provides the ability to use more than one primary key 
> column through composite primary keys. A composite primary key can contain 
> any number of primary key columns, and Phoenix also allows adding new 
> nullable primary key columns to an existing composite primary key. Phoenix 
> uses HBase as its backing store. To let users define multiple primary keys, 
> Phoenix internally concatenates the binary-encoded value of each primary key 
> column and uses the concatenated binary value as the HBase row key. To 
> efficiently concatenate as well as retrieve individual primary key values, 
> Phoenix uses two schemes:
>  # For fixed-length columns: the length of the given column is determined by 
> the maximum length of the column. In the read flow, while iterating through 
> the row key, a fixed number of bytes is retrieved. While writing, if the 
> original encoded value of the given column has fewer bytes, additional null 
> bytes (\x00) are padded until the fixed length is filled up. Hence, for 
> smaller values, we end up wasting some space.
>  # For variable-length columns: since the length of a variable-length value 
> cannot be known in advance, a separator or terminator byte is used. Phoenix 
> uses the null byte (\x00) as the separator. As of today, VARCHAR is the most 
> commonly used variable-length data type, and since VARCHAR represents a 
> String, the null byte is not a valid String character. Hence, it can be 
> effectively used to determine where the given VARCHAR value terminates.
>  
> The null byte (\x00) works fine as a separator for VARCHAR. However, it 
> cannot be used as a separator byte for VARBINARY because VARBINARY can 
> contain any binary blob value. Due to this, Phoenix has restrictions on the 
> VARBINARY type: 
>  
>  # It can only be used as the last part of the composite primary key.
>  # It cannot be used as a DESC order primary key column.
>  
> Using the VARBINARY data type as an earlier portion of the composite primary 
> key is a valid use case, and one could also use multiple VARBINARY primary 
> key columns. After all, Phoenix provides the ability to use multiple primary 
> key columns.
> Besides, using a secondary index on a data table means that the composite 
> primary key of the secondary index table includes: 
>   …  
>   … 
>  
> As primary key columns are appended to the secondary index columns, one 
> cannot create a secondary index on any VARBINARY column.
> The proposal of this Jira is to introduce a new data type 
> {*}VARBINARY_ENCODED{*}, which has no restriction against being used as a 
> composite primary key prefix or as a DESC ordered column.
> This means we need to effectively distinguish where the variable-length 
> binary data terminates in the absence of fixed-length information.
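One common way to make the terminator unambiguous for arbitrary binary data is to escape embedded zero bytes. The sketch below is illustrative only: it is not necessarily the encoding VARBINARY_ENCODED actually uses, it ignores DESC sort order, and the escape/terminator byte choices are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ZeroByteEscaping {
    // Encode: escape each embedded 0x00 as 0x00 0xFF, then append a 0x00 0x00 terminator.
    static byte[] encode(byte[] value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : value) {
            out.write(b);
            if (b == 0x00) out.write(0xFF); // escape the embedded zero byte
        }
        out.write(0x00);
        out.write(0x00); // terminator: 0x00 0x00 never occurs inside the escaped payload
        return out.toByteArray();
    }

    // Decode: stop at the 0x00 0x00 terminator, unescape 0x00 0xFF back to 0x00.
    static byte[] decode(byte[] encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length; i++) {
            if (encoded[i] == 0x00) {
                if (encoded[i + 1] == 0x00) break; // terminator reached
                out.write(0x00);                   // escaped zero byte
                i++;                               // skip the 0xFF escape byte
            } else {
                out.write(encoded[i]);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] v = {1, 0, 2, 0, 0, 3};
        System.out.println(Arrays.equals(decode(encode(v)), v)); // round-trips cleanly
    }
}
```

Because the terminator can never be produced by the payload, such an encoded value can appear anywhere in a concatenated composite row key, which is exactly the restriction this Jira removes.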



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7395) Metadata Cache metrics at server and client side

2024-09-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7395:
--
Fix Version/s: 5.2.1
   5.3.0

> Metadata Cache metrics at server and client side
> 
>
> Key: PHOENIX-7395
> URL: https://issues.apache.org/jira/browse/PHOENIX-7395
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> Phoenix maintains a cache of PTable objects, also known as the Metadata cache 
> (a Guava cache), at both server and client side. The size of the cache at the 
> server side is determined by the config 
> _phoenix.coprocessor.maxMetaDataCacheSize_ with a default value of 20 MB. 
> Similarly, the size of the cache at the client side is determined by 
> _phoenix.client.maxMetaDataCacheSize_ with a default value of 10 MB.
> To understand whether the sizes of the metadata caches at the client and 
> server side are sufficient for the given cluster and the given client, we 
> need some visibility into how efficiently the caches are being utilized.
> The purpose of this Jira is to add some metrics for both of these caches:
>  * Used cache size
>  * Cache hit count
>  * Cache miss count
>  * Cache eviction count
>  * Cache removal count (explicit or replaced)
>  * Cache add count (PTable objects added to the cache)
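A minimal sketch of the requested counters, using a plain LRU map in place of Guava. Everything here is illustrative: the class name, eviction policy, and metric names are assumptions, not the Phoenix implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MeteredMetadataCache<K, V> {
    private final int maxEntries;
    private long hits, misses, evictions, removals, adds;
    private final LinkedHashMap<K, V> cache;

    MeteredMetadataCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // Access-order LRU; evict the eldest entry when over capacity, counting evictions.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                boolean evict = size() > MeteredMetadataCache.this.maxEntries;
                if (evict) evictions++;
                return evict;
            }
        };
    }

    synchronized V get(K key) {
        V v = cache.get(key);
        if (v == null) misses++; else hits++;
        return v;
    }

    synchronized void put(K key, V value) {
        if (cache.put(key, value) != null) removals++; // a replaced entry counts as a removal
        adds++;
    }

    synchronized String metrics() {
        return "size=" + cache.size() + " hits=" + hits + " misses=" + misses
            + " evictions=" + evictions + " removals=" + removals + " adds=" + adds;
    }

    public static void main(String[] args) {
        MeteredMetadataCache<String, String> c = new MeteredMetadataCache<>(2);
        c.put("t1", "ptable1");
        c.put("t2", "ptable2");
        c.get("t1");            // hit
        c.get("missing");       // miss
        c.put("t3", "ptable3"); // evicts the least recently used entry (t2)
        System.out.println(c.metrics());
    }
}
```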



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7395) Metadata Cache metrics at server and client side

2024-09-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7395.
---
Resolution: Fixed

> Metadata Cache metrics at server and client side
> 
>
> Key: PHOENIX-7395
> URL: https://issues.apache.org/jira/browse/PHOENIX-7395
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> Phoenix maintains a cache of PTable objects, also known as the Metadata cache 
> (a Guava cache), at both server and client side. The size of the cache at the 
> server side is determined by the config 
> _phoenix.coprocessor.maxMetaDataCacheSize_ with a default value of 20 MB. 
> Similarly, the size of the cache at the client side is determined by 
> _phoenix.client.maxMetaDataCacheSize_ with a default value of 10 MB.
> To understand whether the sizes of the metadata caches at the client and 
> server side are sufficient for the given cluster and the given client, we 
> need some visibility into how efficiently the caches are being utilized.
> The purpose of this Jira is to add some metrics for both of these caches:
>  * Used cache size
>  * Cache hit count
>  * Cache miss count
>  * Cache eviction count
>  * Cache removal count (explicit or replaced)
>  * Cache add count (PTable objects added to the cache)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7402) Even if a row is updated within TTL it's getting expired partially

2024-09-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7402:
--
Fix Version/s: 5.2.1
   5.3.0

> Even if a row is updated within TTL it's getting expired partially
> ---
>
> Key: PHOENIX-7402
> URL: https://issues.apache.org/jira/browse/PHOENIX-7402
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0, 5.2.1, 5.3.0
>Reporter: Sanjeet Malhotra
>Assignee: Sanjeet Malhotra
>Priority: Critical
> Fix For: 5.2.1, 5.3.0
>
>
> Even though a row is updated within the TTL, it still gets partially expired. 
> The following IT in MaxLookbackExtendedIT can be used to reproduce the issue:
> {code:java}
> @Test
> public void testRetainingLastRowVersion() throws Exception {
>     if (multiCF) {
>         return;
>     }
>     if (hasTableLevelMaxLookback) {
>         optionBuilder.append(", MAX_LOOKBACK_AGE=" + TABLE_LEVEL_MAX_LOOKBACK_AGE * 1000);
>         tableDDLOptions = optionBuilder.toString();
>     }
>     try (Connection conn = DriverManager.getConnection(getUrl())) {
>         String tableName = generateUniqueName();
>         createTable(tableName);
>         injectEdge.setValue(System.currentTimeMillis());
>         EnvironmentEdgeManager.injectEdge(injectEdge);
>         injectEdge.incrementValue(1);
>         Statement stmt = conn.createStatement();
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab', 'abc', 'abcd')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab1')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab2')");
>         conn.commit();
>         injectEdge.incrementValue(16 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('a', 'ab3')");
>         conn.commit();
>         injectEdge.incrementValue(11 * 1000);
>         stmt.execute("upsert into " + tableName + " values ('b', 'bc', 'bcd', 'bcde')");
>         conn.commit();
>         injectEdge.incrementValue(1);
>         TableName dataTableName = TableName.valueOf(tableName);
>         TestUtil.dumpTable(conn, dataTableName);
>         flush(dataTableName);
>         injectEdge.incrementValue(1);
>         TestUtil.dumpTable(conn, dataTableName);
>         majorCompact(dataTableName);
>         TestUtil.dumpTable(conn, dataTableName);
>         injectEdge.incrementValue(1);
>         ResultSet rs = stmt.executeQuery("select * from " + dataTableName + " where id = 'a'");
>         // TestUtil.printResultSet(rs);
>         while (rs.next()) {
>             assertNotNull(rs.getString(3));
>             assertNotNull(rs.getString(4));
>         }
>     }
> } {code}
> The TTL in the above IT is 30 sec, the table-level max lookback age is 10 
> sec, and the cluster-level max lookback is 15 sec.
> The IT fails at the last two checks:
> {code:java}
> while (rs.next()) {
>     assertNotNull(rs.getString(3));
>     assertNotNull(rs.getString(4));
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7398) New PhoenixStatement API to return row for Atomic/Conditional Upserts

2024-09-10 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7398.
---
Resolution: Fixed

> New PhoenixStatement API to return row for Atomic/Conditional Upserts
> -
>
> Key: PHOENIX-7398
> URL: https://issues.apache.org/jira/browse/PHOENIX-7398
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> Phoenix supports Atomic Conditional Upserts with the ON DUPLICATE KEY clause, 
> allowing clients to either ignore the update if the row with the given 
> primary key(s) already exists or conditionally update the row with the given 
> primary key(s). Phoenix also supports returning 0 or 1 as the update status 
> code for Atomic/Conditional Upserts, where 0 represents no update to the row 
> and 1 represents an updated row (when the condition is satisfied by the 
> atomic upsert).
> While standard JDBC APIs support Upserts with the status code returned as 
> integer values 0 or 1, some client applications also require returning the 
> current row state back to the client depending on whether the row is 
> successfully updated. If the condition is satisfied by the atomic upsert and 
> the row is updated, the client application should be able to get the updated 
> row status as HBase Result object. Similarly, if the condition is not 
> satisfied for the given atomic upsert and therefore, row is not updated, the 
> client application should be able to get the old/current row status as HBase 
> Result object.
> This support is somewhat similar in nature to what HBase provides for 
> Increment and Append mutations. Both operations return updated row as Result 
> object. For instance,
> {code:java}
> /**
>  * Appends values to one or more columns within a single row.
>  * 
>  * This operation guarantees atomicity to readers. Appends are done under a 
>  * single row lock, so write operations to a row are synchronized, and readers 
>  * are guaranteed to see this operation fully completed.
>  * @param append object that specifies the columns and values to be appended
>  * @throws IOException e
>  * @return values of columns after the append operation (maybe null)
>  */
> default Result append(final Append append) throws IOException {
> {code}
>  
> HBase does not provide API to return row state for atomic updates with Put 
> and Delete mutations. HBase API checkAndMutate() also supports returning 
> Result for Append and Increment mutations only, not for Put and Delete 
> mutations.
> Phoenix supports batchMutate() with Put mutation(s) by making it Atomic. 
> Hence, for client applications using Phoenix provided atomic upserts, it 
> would be really beneficial to also provide new API at PhoenixStatement and 
> PhoenixPreparedStatement layer that can return the row as Result object back 
> to the client.
> The proposed API signature:
> {code:java}
> /**
>  * Executes the given SQL statement similar to JDBC API executeUpdate() but also
>  * returns the updated or non-updated row as Result object back to the client.
>  * This must be used with auto-commit Connection. This makes the operation atomic.
>  * If the row is successfully updated, return the updated row, otherwise if the
>  * row cannot be updated, return the non-updated row.
>  *
>  * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
>  * @return The pair of int and Tuple, where int represents value 1 for successful
>  * row update and 0 for non-successful row update, and Tuple represents the state
>  * of the row.
>  * @throws SQLException If the statement cannot be executed.
>  */
> public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws SQLException {{code}
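The upsert-side contract can be sketched with a map-backed mock: an atomic conditional upsert returns status 1 plus the updated row when the update applies, and status 0 plus the current row when the condition is not satisfied. This is a hypothetical illustration of the semantics only; the names and the `UnaryOperator` stand-in for ON DUPLICATE KEY logic are invented, not the Phoenix implementation.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class AtomicUpsertContract {
    // Stand-in for the table: primary key -> row payload.
    static final Map<String, String> TABLE = new HashMap<>();

    // Mirrors the proposed semantics for an atomic conditional upsert:
    // (1, updated row) when the update applies, (0, current row) when it does not.
    static synchronized SimpleImmutableEntry<Integer, String> upsertReturnRow(
            String pk, String newValue, UnaryOperator<String> onDuplicate) {
        String current = TABLE.get(pk);
        if (current == null) {                        // no existing row: plain insert
            TABLE.put(pk, newValue);
            return new SimpleImmutableEntry<>(1, newValue);
        }
        String updated = onDuplicate.apply(current);  // ON DUPLICATE KEY logic
        if (updated == null || updated.equals(current)) {
            return new SimpleImmutableEntry<>(0, current); // condition not satisfied
        }
        TABLE.put(pk, updated);
        return new SimpleImmutableEntry<>(1, updated);
    }

    public static void main(String[] args) {
        // ON DUPLICATE KEY IGNORE: keep the existing row unchanged.
        UnaryOperator<String> ignore = cur -> cur;
        System.out.println(upsertReturnRow("a", "v1", ignore)); // inserted: status 1
        System.out.println(upsertReturnRow("a", "v2", ignore)); // ignored: status 0, current row
    }
}
```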



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7398) New PhoenixStatement API to return row for Atomic/Conditional Upserts

2024-09-10 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7398:
--
Fix Version/s: 5.3.0

> New PhoenixStatement API to return row for Atomic/Conditional Upserts
> -
>
> Key: PHOENIX-7398
> URL: https://issues.apache.org/jira/browse/PHOENIX-7398
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> Phoenix supports Atomic Conditional Upserts with the ON DUPLICATE KEY clause, 
> allowing clients to either ignore the update if the row with the given 
> primary key(s) already exists or conditionally update the row with the given 
> primary key(s). Phoenix also supports returning 0 or 1 as the update status 
> code for Atomic/Conditional Upserts, where 0 represents no update to the row 
> and 1 represents an updated row (when the condition is satisfied by the 
> atomic upsert).
> While standard JDBC APIs support Upserts with the status code returned as 
> integer values 0 or 1, some client applications also require returning the 
> current row state back to the client depending on whether the row is 
> successfully updated. If the condition is satisfied by the atomic upsert and 
> the row is updated, the client application should be able to get the updated 
> row status as HBase Result object. Similarly, if the condition is not 
> satisfied for the given atomic upsert and therefore, row is not updated, the 
> client application should be able to get the old/current row status as HBase 
> Result object.
> This support is somewhat similar in nature to what HBase provides for 
> Increment and Append mutations. Both operations return updated row as Result 
> object. For instance,
> {code:java}
> /**
>  * Appends values to one or more columns within a single row.
>  * 
>  * This operation guarantees atomicity to readers. Appends are done under a 
>  * single row lock, so write operations to a row are synchronized, and readers 
>  * are guaranteed to see this operation fully completed.
>  * @param append object that specifies the columns and values to be appended
>  * @throws IOException e
>  * @return values of columns after the append operation (maybe null)
>  */
> default Result append(final Append append) throws IOException {
> {code}
>  
> HBase does not provide API to return row state for atomic updates with Put 
> and Delete mutations. HBase API checkAndMutate() also supports returning 
> Result for Append and Increment mutations only, not for Put and Delete 
> mutations.
> Phoenix supports batchMutate() with Put mutation(s) by making it Atomic. 
> Hence, for client applications using Phoenix provided atomic upserts, it 
> would be really beneficial to also provide new API at PhoenixStatement and 
> PhoenixPreparedStatement layer that can return the row as Result object back 
> to the client.
> The proposed API signature:
> {code:java}
> /**
>  * Executes the given SQL statement similar to JDBC API executeUpdate() but 
> also returns the
>  * updated or non-updated row as Result object back to the client. This must 
> be used with
>  * auto-commit Connection. This makes the operation atomic.
>  * If the row is successfully updated, return the updated row, otherwise if 
> the row
>  * cannot be updated, return non-updated row.
>  *
>  * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
>  * @return The pair of int and Tuple, where int represents value 1 for 
> successful row
>  * update and 0 for non-successful row update, and Tuple represents the state 
> of the row.
>  * @throws SQLException If the statement cannot be executed.
>  */
> public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws 
> SQLException {{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-5117) Return the count of rows scanned in HBase

2024-09-05 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-5117:
--
Priority: Major  (was: Minor)

> Return the count of rows scanned in HBase
> -
>
> Key: PHOENIX-5117
> URL: https://issues.apache.org/jira/browse/PHOENIX-5117
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.14.1
>Reporter: Chen Feng
>Assignee: Palash Chauhan
>Priority: Major
> Fix For: 5.3.0
>
> Attachments: PHOENIX-5117-4.x-HBase-1.4-v1.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v2.patch, PHOENIX-5117-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v4.patch, PHOENIX-5117-4.x-HBase-1.4-v5.patch, 
> PHOENIX-5117-4.x-HBase-1.4-v6.patch, PHOENIX-5117-v1.patch
>
>
> HBASE-5980 provides the ability to return the number of rows scanned. Such 
> metrics should also be returned by Phoenix.
> HBASE-21815 is required.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7398) New PhoenixStatement API to return row for Atomic/Conditional Upserts

2024-09-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7398:
--
Description: 
Phoenix supports Atomic Conditional Upserts with ON DUPLICATE KEY clause, 
allowing clients to either ignore the update if the row with the given primary 
key(s) already exists or conditionally update the row with the given primary 
key(s). Phoenix also supports returning 0 or 1 as updated status code for the 
given Atomic/Conditional Upserts, where 0 represents no updates to the row and 
1 represents the updated row (when the condition is satisfied by the atomic 
upsert).

While standard JDBC APIs support Upserts with the status code returned as 
integer values 0 or 1, some client applications also require returning the 
current row state back to the client depending on whether the row is 
successfully updated. If the condition is satisfied by the atomic upsert and 
the row is updated, the client application should be able to get the updated 
row status as HBase Result object. Similarly, if the condition is not satisfied 
for the given atomic upsert and therefore the row is not updated, the client 
application should be able to get the old/current row status as HBase Result 
object.

This support is somewhat similar in nature to what HBase provides for Increment 
and Append mutations. Both operations return updated row as Result object. For 
instance,
{code:java}
/**
 * Appends values to one or more columns within a single row.
 * 
 * This operation guarantees atomicity to readers. Appends are done under a 
single row lock, so
 * write operations to a row are synchronized, and readers are guaranteed to 
see this operation
 * fully completed.
 * @param append object that specifies the columns and values to be appended
 * @throws IOException e
 * @return values of columns after the append operation (maybe null)
 */
default Result append(final Append append) throws IOException {
{code}
 

HBase does not provide an API to return row state for atomic updates with Put and 
Delete mutations. The HBase API checkAndMutate() also supports returning Result for 
Append and Increment mutations only, not for Put and Delete mutations.

Phoenix supports batchMutate() with Put mutation(s) by making it atomic. Hence, 
for client applications using Phoenix-provided atomic upserts, it would be 
really beneficial to also provide a new API at the PhoenixStatement and 
PhoenixPreparedStatement layer that can return the row as a Result object back to 
the client.

The proposed API signature:
{code:java}
/**
 * Executes the given SQL statement similar to JDBC API executeUpdate() but 
also returns the
 * updated or non-updated row as Result object back to the client. This must be 
used with
 * auto-commit Connection. This makes the operation atomic.
 * If the row is successfully updated, return the updated row, otherwise if the 
row
 * cannot be updated, return non-updated row.
 *
 * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
 * @return The pair of int and Tuple, where int represents value 1 for 
successful row
 * update and 0 for non-successful row update, and Tuple represents the state 
of the row.
 * @throws SQLException If the statement cannot be executed.
 */
public Pair<Integer, Tuple> executeUpdateReturnRow(String sql) throws 
SQLException {{code}

  was:
Phoenix supports Atomic Conditional Upserts with ON DUPLICATE KEY clause, 
allowing clients to either ignore the update if the row with the given primary 
key(s) already exists or conditionally update the row with the given primary 
key(s). Phoenix also supports returning 0 or 1 as updated status code for the 
given Atomic/Conditional Upserts, where 0 represents no updates to the row and 
1 represents the updated row (when the condition is satisfied by the atomic 
upsert).

While standard JDBC APIs support Upserts with the status code returned as 
integer values 0 or 1, some client applications also require returning the 
current row state back to the client depending on whether the row is 
successfully updated. If the condition is satisfied by the atomic upsert and 
the row is updated, the client application should be able to get the updated 
row status as HBase Result object. Similarly, if the condition is not satisfied 
for the given atomic upsert and therefore the row is not updated, the client 
application should be able to get the old/current row status as HBase Result 
object.

This support is somewhat similar in nature to what HBase provides for Increment 
and Append mutations. Both operations return updated row as Result object. For 
instance,
{code:java}
/**
 * Appends values to one or more columns within a single row.
 * 
 * This operation guarantees atomicity to readers. Appends are done under a 
single row lock, so
 * write operations to a row are synchronized, and readers are guaranteed to 
see this operation
 * fully completed.
 * @param append object that spec

[jira] [Resolved] (PHOENIX-7367) Snapshot based mapreduce jobs fails after HBASE-28401

2024-09-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7367.
---
Resolution: Fixed

> Snapshot based mapreduce jobs fails after HBASE-28401
> -
>
> Key: PHOENIX-7367
> URL: https://issues.apache.org/jira/browse/PHOENIX-7367
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ujjawal Kumar
>Assignee: Ujjawal Kumar
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
> Attachments: Screenshot 2024-07-19 at 8.18.06 PM.png, Screenshot 
> 2024-07-19 at 8.18.25 PM.png
>
>
> HBASE-28401 had a regression due to which HRegion#close throws NPE while 
> trying to close the memstore within the mapper.
> Due to this, snapshot-based MR jobs have started failing in Phoenix.
> This is due to the fact that TableSnapshotResultIterator ends up trying to 
> release the read lock twice via HRegion#closeRegionOperation 
>  * TableSnapshotResultIterator's next method [calls ScanningResultIterator's 
> next 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L180].
>  * 
>  ** ScanningResultIterator's [next tries to close the SnapshotScanner 
> early|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-client/src/main/java/org/apache/phoenix/iterate/ScanningResultIterator.java#L225]
>  ** Within [SnapshotScanner's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/SnapshotScanner.java#L180-L187]
>  * 
>  ** 
>  ***  HRegion#closeRegionOperation released the read lock and was successful
>  ***  HRegion#close which threw IOException due to memstore issue 
> (HBASE-28401)
>  ***  SnapshotScanner catches the IOException but doesn't set region field to 
> null
>  * TableSnapshotResultIterator's [finally block calls 
> ScanningResultIterator's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L187-L190].
>  * 
>  ** 
>  *** *ScanningResultIterator's close is called again*
>  *** *Since region field wasn't null,* *HRegion#closeRegionOperation is 
> called again and throws IllegalMonitorStateException while trying to release 
> the read lock*
>  * 
>  ** 
>  *** The IllegalMonitorStateException then causes the whole mapper to fail
> It doesn't cause failure while doing snapshot reads via HBase (ref 
> HBASE-28743 where the same NPE was observed but the mapper still passes), 
> because the closest equivalent code (RecordReader within 
> TableSnapshotInputFormat) doesn't try to close the region [as part of its 
> nextKeyValue 
> method|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L275-L280].
> This is generally much safer [because record readers are always closed 
> explicitly (even if the mapper's run method 
> fails)|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java#L466-L481].
> There are 2 improvements that can be done here:
> 1. Disable MSLAB for the region created within the snapshot (by setting 
> hbase.hregion.memstore.mslab.enabled to false).
> 2. In TableSnapshotResultIterator, remove the SnapshotScanner's close 
> (via ScanningResultIterator) called within the next method. It would anyway be 
> closed by the mapper at the end.
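The double release above comes down to a close() that is not idempotent. A minimal sketch of the guard (hypothetical names; the real SnapshotScanner/HRegion classes differ, but the pattern of clearing the region reference once the read lock is released is the fix idea):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: make close() idempotent by nulling the region reference as soon as
// the read lock is released, so a second close() from the iterator's finally
// block is a no-op instead of a double unlock (IllegalMonitorStateException).
class SnapshotScannerSketch {
    // Minimal stand-in for HRegion's region-operation read lock.
    static class Region {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        void startRegionOperation() { lock.readLock().lock(); }
        void closeRegionOperation() { lock.readLock().unlock(); } // throws if not held
    }

    private Region region;

    SnapshotScannerSketch(Region region) {
        this.region = region;
        region.startRegionOperation();
    }

    void close() {
        if (region == null) {
            return; // already closed: second call is a no-op
        }
        try {
            region.closeRegionOperation();
        } finally {
            region = null; // clear even if a later step (e.g. HRegion#close) throws
        }
    }
}
```

Calling close() twice on this sketch is safe, matching the behavior the fix wants from ScanningResultIterator/SnapshotScanner.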



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7367) Snapshot based mapreduce jobs fails after HBASE-28401

2024-09-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7367:
--
Fix Version/s: 5.2.1
   5.3.0

> Snapshot based mapreduce jobs fails after HBASE-28401
> -
>
> Key: PHOENIX-7367
> URL: https://issues.apache.org/jira/browse/PHOENIX-7367
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ujjawal Kumar
>Assignee: Ujjawal Kumar
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
> Attachments: Screenshot 2024-07-19 at 8.18.06 PM.png, Screenshot 
> 2024-07-19 at 8.18.25 PM.png
>
>
> HBASE-28401 had a regression due to which HRegion#close throws NPE while 
> trying to close the memstore within the mapper.
> Due to this, snapshot-based MR jobs have started failing in Phoenix.
> This is due to the fact that TableSnapshotResultIterator ends up trying to 
> release the read lock twice via HRegion#closeRegionOperation 
>  * TableSnapshotResultIterator's next method [calls ScanningResultIterator's 
> next 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L180].
>  * 
>  ** ScanningResultIterator's [next tries to close the SnapshotScanner 
> early|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-client/src/main/java/org/apache/phoenix/iterate/ScanningResultIterator.java#L225]
>  ** Within [SnapshotScanner's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/SnapshotScanner.java#L180-L187]
>  * 
>  ** 
>  ***  HRegion#closeRegionOperation released the read lock and was successful
>  ***  HRegion#close which threw IOException due to memstore issue 
> (HBASE-28401)
>  ***  SnapshotScanner catches the IOException but doesn't set region field to 
> null
>  * TableSnapshotResultIterator's [finally block calls 
> ScanningResultIterator's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L187-L190].
>  * 
>  ** 
>  *** *ScanningResultIterator's close is called again*
>  *** *Since region field wasn't null,* *HRegion#closeRegionOperation is 
> called again and throws IllegalMonitorStateException while trying to release 
> the read lock*
>  * 
>  ** 
>  *** The IllegalMonitorStateException then causes the whole mapper to fail
> It doesn't cause failure while doing snapshot reads via HBase (ref 
> HBASE-28743 where the same NPE was observed but the mapper still passes), 
> because the closest equivalent code (RecordReader within 
> TableSnapshotInputFormat) doesn't try to close the region [as part of its 
> nextKeyValue 
> method|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L275-L280].
> This is generally much safer [because record readers are always closed 
> explicitly (even if the mapper's run method 
> fails)|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java#L466-L481].
> There are 2 improvements that can be done here:
> 1. Disable MSLAB for the region created within the snapshot (by setting 
> hbase.hregion.memstore.mslab.enabled to false).
> 2. In TableSnapshotResultIterator, remove the SnapshotScanner's close 
> (via ScanningResultIterator) called within the next method. It would anyway be 
> closed by the mapper at the end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7367) Snapshot based mapreduce jobs fails after HBASE-28401

2024-09-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7367:
-

Assignee: Ujjawal Kumar

> Snapshot based mapreduce jobs fails after HBASE-28401
> -
>
> Key: PHOENIX-7367
> URL: https://issues.apache.org/jira/browse/PHOENIX-7367
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Ujjawal Kumar
>Assignee: Ujjawal Kumar
>Priority: Major
> Attachments: Screenshot 2024-07-19 at 8.18.06 PM.png, Screenshot 
> 2024-07-19 at 8.18.25 PM.png
>
>
> HBASE-28401 had a regression due to which HRegion#close throws NPE while 
> trying to close the memstore within the mapper.
> Due to this, snapshot-based MR jobs have started failing in Phoenix.
> This is due to the fact that TableSnapshotResultIterator ends up trying to 
> release the read lock twice via HRegion#closeRegionOperation 
>  * TableSnapshotResultIterator's next method [calls ScanningResultIterator's 
> next 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L180].
>  * 
>  ** ScanningResultIterator's [next tries to close the SnapshotScanner 
> early|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-client/src/main/java/org/apache/phoenix/iterate/ScanningResultIterator.java#L225]
>  ** Within [SnapshotScanner's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/SnapshotScanner.java#L180-L187]
>  * 
>  ** 
>  ***  HRegion#closeRegionOperation released the read lock and was successful
>  ***  HRegion#close which threw IOException due to memstore issue 
> (HBASE-28401)
>  ***  SnapshotScanner catches the IOException but doesn't set region field to 
> null
>  * TableSnapshotResultIterator's [finally block calls 
> ScanningResultIterator's close 
> method|https://github.com/apache/phoenix/blob/1e96a2756eaf0a2201a50579789190e8c10747df/phoenix-core-server/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L187-L190].
>  * 
>  ** 
>  *** *ScanningResultIterator's close is called again*
>  *** *Since region field wasn't null,* *HRegion#closeRegionOperation is 
> called again and throws IllegalMonitorStateException while trying to release 
> the read lock*
>  * 
>  ** 
>  *** The IllegalMonitorStateException then causes the whole mapper to fail
> It doesn't cause failure while doing snapshot reads via HBase (ref 
> HBASE-28743 where the same NPE was observed but the mapper still passes), 
> because the closest equivalent code (RecordReader within 
> TableSnapshotInputFormat) doesn't try to close the region [as part of its 
> nextKeyValue 
> method|https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L275-L280].
> This is generally much safer [because record readers are always closed 
> explicitly (even if the mapper's run method 
> fails)|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java#L466-L481].
> There are 2 improvements that can be done here:
> 1. Disable MSLAB for the region created within the snapshot (by setting 
> hbase.hregion.memstore.mslab.enabled to false).
> 2. In TableSnapshotResultIterator, remove the SnapshotScanner's close 
> (via ScanningResultIterator) called within the next method. It would anyway be 
> closed by the mapper at the end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PHOENIX-7398) New PhoenixStatement API to return row for Atomic/Conditional Upserts

2024-08-29 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7398:
-

 Summary: New PhoenixStatement API to return row for 
Atomic/Conditional Upserts
 Key: PHOENIX-7398
 URL: https://issues.apache.org/jira/browse/PHOENIX-7398
 Project: Phoenix
  Issue Type: New Feature
Reporter: Viraj Jasani


Phoenix supports Atomic Conditional Upserts with ON DUPLICATE KEY clause, 
allowing clients to either ignore the update if the row with the given primary 
key(s) already exists or conditionally update the row with the given primary 
key(s). Phoenix also supports returning 0 or 1 as updated status code for the 
given Atomic/Conditional Upserts, where 0 represents no updates to the row and 
1 represents the updated row (when the condition is satisfied by the atomic 
upsert).

While standard JDBC APIs support Upserts with the status code returned as 
integer values 0 or 1, some client applications also require returning the 
current row state back to the client depending on whether the row is 
successfully updated. If the condition is satisfied by the atomic upsert and 
the row is updated, the client application should be able to get the updated 
row status as HBase Result object. Similarly, if the condition is not satisfied 
for the given atomic upsert and therefore the row is not updated, the client 
application should be able to get the old/current row status as HBase Result 
object.

This support is somewhat similar in nature to what HBase provides for Increment 
and Append mutations. Both operations return updated row as Result object. For 
instance,
{code:java}
/**
 * Appends values to one or more columns within a single row.
 * 
 * This operation guarantees atomicity to readers. Appends are done under a 
single row lock, so
 * write operations to a row are synchronized, and readers are guaranteed to 
see this operation
 * fully completed.
 * @param append object that specifies the columns and values to be appended
 * @throws IOException e
 * @return values of columns after the append operation (maybe null)
 */
default Result append(final Append append) throws IOException {
{code}
 

HBase does not provide an API to return row state for atomic updates with Put and 
Delete mutations. The HBase API checkAndMutate() also supports returning Result for 
Append and Increment mutations only, not for Put and Delete mutations.

Phoenix supports batchMutate() with Put mutation(s) by making it atomic. Hence, 
for client applications using Phoenix-provided atomic upserts, it would be 
really beneficial to also provide a new API at the PhoenixStatement and 
PhoenixPreparedStatement layer that can return the row as a Result object back to 
the client.

The proposed API signature:
{code:java}
/**
 * Executes the given SQL statement similar to JDBC API executeUpdate() but 
also returns the
 * updated or non-updated row as Result object back to the client. This must be 
used with
 * auto-commit Connection. This makes the operation atomic.
 * If the row is successfully updated, return the updated row, otherwise if the 
row
 * cannot be updated, return non-updated row.
 *
 * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
 * @return The pair of int and Result, where int represents value 1 for 
successful row update
 * and 0 for non-successful row update, and Result represents the state of the 
row.
 * @throws SQLException If the statement cannot be executed.
 */
public Pair<Integer, Result> executeUpdateReturnRow(String sql) throws 
SQLException {
 {code}
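The intended return contract of the proposed API can be illustrated with a self-contained sketch (plain Java with hypothetical names; not the actual Phoenix API, which would return the row as a Result/Tuple rather than an int value):

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;

// Sketch of the executeUpdateReturnRow() contract: the pair is
// (1, updated row) when the conditional update applies, and
// (0, current row) when the condition is not satisfied.
class ReturnRowSketch {
    private final Map<String, Integer> rows = new HashMap<>();

    // Hypothetical stand-in for "UPSERT ... ON DUPLICATE KEY UPDATE
    // v = v + delta" guarded by the condition v < limit.
    Map.Entry<Integer, Integer> upsertReturnRow(String key, int delta, int limit) {
        Integer current = rows.get(key);
        if (current == null) {
            rows.put(key, delta);                          // row absent: plain upsert
            return new SimpleImmutableEntry<>(1, delta);   // (1, updated row)
        }
        if (current < limit) {                             // condition satisfied
            int updated = current + delta;
            rows.put(key, updated);
            return new SimpleImmutableEntry<>(1, updated); // (1, updated row)
        }
        return new SimpleImmutableEntry<>(0, current);     // (0, non-updated row)
    }
}
```

The point is that the caller gets the row state back in both branches, instead of only the 0/1 status code that executeUpdate() returns.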



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7398) New PhoenixStatement API to return row for Atomic/Conditional Upserts

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7398:
-

Assignee: Viraj Jasani

> New PhoenixStatement API to return row for Atomic/Conditional Upserts
> -
>
> Key: PHOENIX-7398
> URL: https://issues.apache.org/jira/browse/PHOENIX-7398
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Phoenix supports Atomic Conditional Upserts with ON DUPLICATE KEY clause, 
> allowing clients to either ignore the update if the row with the given 
> primary key(s) already exists or conditionally update the row with the given 
> primary key(s). Phoenix also supports returning 0 or 1 as updated status code 
> for the given Atomic/Conditional Upserts, where 0 represents no updates to 
> the row and 1 represents the updated row (when the condition is satisfied by 
> the atomic upsert).
> While standard JDBC APIs support Upserts with the status code returned as 
> integer values 0 or 1, some client applications also require returning the 
> current row state back to the client depending on whether the row is 
> successfully updated. If the condition is satisfied by the atomic upsert and 
> the row is updated, the client application should be able to get the updated 
> row status as HBase Result object. Similarly, if the condition is not 
> satisfied for the given atomic upsert and therefore the row is not updated, the 
> client application should be able to get the old/current row status as HBase 
> Result object.
> This support is somewhat similar in nature to what HBase provides for 
> Increment and Append mutations. Both operations return updated row as Result 
> object. For instance,
> {code:java}
> /**
>  * Appends values to one or more columns within a single row.
>  * 
>  * This operation guarantees atomicity to readers. Appends are done under a 
> single row lock, so
>  * write operations to a row are synchronized, and readers are guaranteed to 
> see this operation
>  * fully completed.
>  * @param append object that specifies the columns and values to be appended
>  * @throws IOException e
>  * @return values of columns after the append operation (maybe null)
>  */
> default Result append(final Append append) throws IOException {
> {code}
>  
> HBase does not provide an API to return row state for atomic updates with Put 
> and Delete mutations. The HBase API checkAndMutate() also supports returning 
> Result for Append and Increment mutations only, not for Put and Delete 
> mutations.
> Phoenix supports batchMutate() with Put mutation(s) by making it atomic. 
> Hence, for client applications using Phoenix-provided atomic upserts, it 
> would be really beneficial to also provide a new API at the PhoenixStatement 
> and PhoenixPreparedStatement layer that can return the row as a Result object 
> back to the client.
> The proposed API signature:
> {code:java}
> /**
>  * Executes the given SQL statement similar to JDBC API executeUpdate() but 
> also returns the
>  * updated or non-updated row as Result object back to the client. This must 
> be used with
>  * auto-commit Connection. This makes the operation atomic.
>  * If the row is successfully updated, return the updated row, otherwise if 
> the row
>  * cannot be updated, return non-updated row.
>  *
>  * @param sql The SQL DML statement, UPSERT or DELETE for Phoenix.
>  * @return The pair of int and Result, where int represents value 1 for 
> successful row update
>  * and 0 for non-successful row update, and Result represents the state of 
> the row.
>  * @throws SQLException If the statement cannot be executed.
>  */
> public Pair<Integer, Result> executeUpdateReturnRow(String sql) throws 
> SQLException {
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7395) Metadata Cache metrics at server and client side

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7395:
--
Description: 
Phoenix maintains cache of PTable objects, also known as Metadata cache (as a 
Guava cache) at both server and client side. The size of the cache at server 
side is determined by config _phoenix.coprocessor.maxMetaDataCacheSize_ with 
default value of 20 MB. Similarly, the size of the cache at client side is 
determined by _phoenix.client.maxMetaDataCacheSize_ with default value of 10 MB.

To understand whether the size of the metadata caches at client and server side 
are sufficient for the given cluster and the given client, we need some 
visibility into how efficiently the caches are being utilized.

The purpose of this Jira is to add some metrics for both of these caches:
 * Used cache size
 * Cache hit count
 * Cache miss count
 * Cache eviction count
 * Cache removal count (explicit or replaced)
 * Cache add count (PTable objects added to the cache)
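The proposed counters map closely onto what Guava already records via CacheBuilder.recordStats() and Cache.stats() (hit, miss, and eviction counts). As a self-contained illustration of the full set, a stdlib-only sketch (hypothetical class, not the Phoenix implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Minimal metadata-cache sketch tracking the proposed metrics:
// hits, misses, capacity evictions, explicit removals, and adds.
class MetadataCacheSketch<K, V> {
    final LongAdder hits = new LongAdder();
    final LongAdder misses = new LongAdder();
    final LongAdder evictions = new LongAdder();
    final LongAdder removals = new LongAdder();
    final LongAdder adds = new LongAdder();

    private final Map<K, V> cache;

    MetadataCacheSketch(int maxEntries) {
        // Access-ordered LRU map; removeEldestEntry fires on capacity overflow.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                boolean evict = size() > maxEntries;
                if (evict) {
                    evictions.increment(); // capacity-based eviction
                }
                return evict;
            }
        };
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            misses.increment();
        } else {
            hits.increment();
        }
        return value;
    }

    void put(K key, V value) {
        adds.increment(); // PTable-equivalent object added to the cache
        cache.put(key, value);
    }

    void invalidate(K key) {
        if (cache.remove(key) != null) {
            removals.increment(); // explicit removal, distinct from eviction
        }
    }
}
```

Comparing hit vs. miss counts over time is what tells whether the configured cache sizes are adequate for a given workload.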

  was:
Phoenix maintains cache of PTable objects, also known as Metadata cache (as a 
Guava cache) at both server and client side. The size of the cache at server 
side is determined by config _phoenix.coprocessor.maxMetaDataCacheSize_ with 
default value of 20 MB. Similarly, the size of the cache at client side is 
determined by _phoenix.client.maxMetaDataCacheSize_ with default value of 10 MB.

To understand whether the size of the metadata caches at client and server side 
are sufficient for the given cluster and the given client, we need some 
visibility into how efficiently the caches are being utilized.

The purpose of this Jira is to add some metrics for both of these caches:
 * Used cache size (in bytes)
 * Cache hit rates (per second)
 * Cache miss rates (per second)
 * Cache eviction rates (per second)


> Metadata Cache metrics at server and client side
> 
>
> Key: PHOENIX-7395
> URL: https://issues.apache.org/jira/browse/PHOENIX-7395
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Phoenix maintains cache of PTable objects, also known as Metadata cache (as a 
> Guava cache) at both server and client side. The size of the cache at server 
> side is determined by config _phoenix.coprocessor.maxMetaDataCacheSize_ with 
> default value of 20 MB. Similarly, the size of the cache at client side is 
> determined by _phoenix.client.maxMetaDataCacheSize_ with default value of 10 
> MB.
> To understand whether the size of the metadata caches at client and server 
> side are sufficient for the given cluster and the given client, we need some 
> visibility into how efficiently the caches are being utilized.
> The purpose of this Jira is to add some metrics for both of these caches:
>  * Used cache size
>  * Cache hit count
>  * Cache miss count
>  * Cache eviction count
>  * Cache removal count (explicit or replaced)
>  * Cache add count (PTable objects added to the cache)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7395) Metadata Cache metrics at server and client side

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7395:
-

Assignee: Jing Yu

> Metadata Cache metrics at server and client side
> 
>
> Key: PHOENIX-7395
> URL: https://issues.apache.org/jira/browse/PHOENIX-7395
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
>
> Phoenix maintains cache of PTable objects, also known as Metadata cache (as a 
> Guava cache) at both server and client side. The size of the cache at server 
> side is determined by config _phoenix.coprocessor.maxMetaDataCacheSize_ with 
> default value of 20 MB. Similarly, the size of the cache at client side is 
> determined by _phoenix.client.maxMetaDataCacheSize_ with default value of 10 
> MB.
> To understand whether the size of the metadata caches at client and server 
> side are sufficient for the given cluster and the given client, we need some 
> visibility into how efficiently the caches are being utilized.
> The purpose of this Jira is to add some metrics for both of these caches:
>  * Used cache size
>  * Cache hit count
>  * Cache miss count
>  * Cache eviction count
>  * Cache removal count (explicit or replaced)
>  * Cache add count (PTable objects added to the cache)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7386) Override UPDATE_CACHE_FREQUENCY if table has disabled indexes

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7386:
--
Fix Version/s: 5.1.4

> Override UPDATE_CACHE_FREQUENCY if table has disabled indexes
> -
>
> Key: PHOENIX-7386
> URL: https://issues.apache.org/jira/browse/PHOENIX-7386
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> If a table has UPDATE_CACHE_FREQUENCY set to a non-default value, i.e. 
> anything other than ALWAYS, PTable objects are cached in the client-side 
> metadata cache for that duration. One of the common practices when creating 
> an index is to create it in CREATE_DISABLE state.
> When any index is in DISABLE, CREATE_DISABLE or PENDING_ACTIVE state, the 
> Phoenix client does not include the PTable of the corresponding index in 
> the IndexMaintainer objects sent to the server. Since indexes are expected 
> to move from the above disabled states to either dropped or building, it is 
> crucial for the Phoenix client to override the non-default value of 
> UPDATE_CACHE_FREQUENCY on the base table. This helps avoid data integrity 
> issues: until the index state becomes BUILDING or ACTIVE, the client keeps 
> overriding UPDATE_CACHE_FREQUENCY to the default value so that UPSERT 
> and/or SELECT queries initiate the getTable() RPC and refresh the 
> client-side metadata cache entry.
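The override decision described above can be sketched as follows. This is only an illustration under simplified assumptions; the constant and function names below are hypothetical, not Phoenix's actual API.

```python
# Illustrative sketch of the client-side override; names are hypothetical.
DISABLED_STATES = {"DISABLE", "CREATE_DISABLE", "PENDING_ACTIVE"}
ALWAYS = 0  # an UPDATE_CACHE_FREQUENCY of ALWAYS means "do not cache"

def effective_update_cache_frequency(table_frequency_ms, index_states):
    """Override the table's non-default UPDATE_CACHE_FREQUENCY with ALWAYS
    while any index is in a disabled state, so UPSERT/SELECT keep issuing
    getTable() RPCs and refreshing the client-side metadata cache."""
    if any(state in DISABLED_STATES for state in index_states):
        return ALWAYS
    return table_frequency_ms

print(effective_update_cache_frequency(300000, ["ACTIVE", "CREATE_DISABLE"]))  # 0
print(effective_update_cache_frequency(300000, ["ACTIVE", "BUILDING"]))  # 300000
```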





[jira] [Resolved] (PHOENIX-7394) MaxPhoenixColumnSizeExceededException should not print rowkey

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7394.
---
Resolution: Fixed

> MaxPhoenixColumnSizeExceededException should not print rowkey
> -
>
> Key: PHOENIX-7394
> URL: https://issues.apache.org/jira/browse/PHOENIX-7394
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> PHOENIX-6167 introduced the config "hbase.client.keyvalue.maxsize" with a 
> default value of 10 MB to limit the size of column values while preparing 
> UPSERT statements.
> For ONE_CELL_PER_COLUMN storage, if any column value is larger than the 
> limit (10 MB by default), MaxPhoenixColumnSizeExceededException is thrown. 
> However, the exception should only provide table and column info; the 
> rowkey should not be printed.





[jira] [Resolved] (PHOENIX-7393) Update transitive dependency of woodstox-core to 5.4.0

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7393.
---
Resolution: Fixed

> Update transitive dependency of woodstox-core to 5.4.0
> --
>
> Key: PHOENIX-7393
> URL: https://issues.apache.org/jira/browse/PHOENIX-7393
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Grzegorz Kokosinski
>Assignee: Grzegorz Kokosinski
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> Exclude woodstox-core to fix 
> [CVE-2022-40152|https://github.com/advisories/GHSA-3f7h-mf4q-vrm4] 
> ([https://nvd.nist.gov/vuln/detail/CVE-2022-40152]).
> This is a transitive dependency from Hadoop and is most likely not needed 
> by Phoenix. Note that any product using {{phoenix-client-embedded}} to use 
> Phoenix internally is flagged with this CVE.
> This dependency is used in the Trino Phoenix connector, which makes all of 
> Trino flagged with this CVE.
> Updating the transitive dependency of woodstox-core to 5.4.0 fixes the 
> issue.
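A downstream consumer of {{phoenix-client-embedded}} could pin the transitive woodstox-core to the fixed version along these lines. This is a hypothetical Maven fragment (the woodstox-core coordinates are real; where it goes in a consumer's pom will vary):

```xml
<!-- Hypothetical consumer-side fragment: force the transitive
     woodstox-core to the version that fixes CVE-2022-40152. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.woodstox</groupId>
      <artifactId>woodstox-core</artifactId>
      <version>5.4.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```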





[jira] [Updated] (PHOENIX-7393) Update transitive dependency of woodstox-core to 5.4.0

2024-08-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7393:
--
Fix Version/s: 5.2.1
   5.3.0
   5.1.4

> Update transitive dependency of woodstox-core to 5.4.0
> --
>
> Key: PHOENIX-7393
> URL: https://issues.apache.org/jira/browse/PHOENIX-7393
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Grzegorz Kokosinski
>Assignee: Grzegorz Kokosinski
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> Exclude woodstox-core to fix 
> [CVE-2022-40152|https://github.com/advisories/GHSA-3f7h-mf4q-vrm4] 
> ([https://nvd.nist.gov/vuln/detail/CVE-2022-40152]).
> This is a transitive dependency from Hadoop and is most likely not needed 
> by Phoenix. Note that any product using {{phoenix-client-embedded}} to use 
> Phoenix internally is flagged with this CVE.
> This dependency is used in the Trino Phoenix connector, which makes all of 
> Trino flagged with this CVE.
> Updating the transitive dependency of woodstox-core to 5.4.0 fixes the 
> issue.





[jira] [Updated] (PHOENIX-7396) BSON_VALUE function to retrieve BSON field value with given data type

2024-08-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7396:
--
Fix Version/s: 5.3.0

> BSON_VALUE function to retrieve BSON field value with given data type
> -
>
> Key: PHOENIX-7396
> URL: https://issues.apache.org/jira/browse/PHOENIX-7396
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> The purpose of this Jira is to introduce a new function for the BSON data 
> type to retrieve the value of a given BSON document field key. The function 
> should also take a data type as an argument to decode the document field 
> value.
> h2. *Function Grammar:*
> *Name:* BSON_VALUE
> *Arguments:*
> | |*Expression*|*DataType*|
> |1|Column Value|BSON|
> |2|Bson Field Key|The field key can represent any top level or nested fields 
> within the document. The caller should use "." notation for accessing nested 
> document elements and "[n]" notation for accessing nested array elements. 
> Unlike nested fields, top level document fields do not require any additional 
> character.|
> |3|SQL Data Type|The data type that the client expects the value of the field 
> to be converted to while returning the value.|
>  
> *Definition:* The function returns the value of the given field key from the 
> BSON Document. The client is expected to provide the data type that is used 
> for decoding the value of the field key.
> *Return Type:* PDataType (Depending on the third argument of the function, 
> the data type conversion takes place)
> *Examples:*
>  * BSON_VALUE(COL, 'topfield', 'DOUBLE')
>  * BSON_VALUE(COL, 'topfield.nestedfield1', 'VARCHAR')
>  * BSON_VALUE(COL, 'topfield.nestedfield[2]', 'INTEGER')
> Here, COL represents the column name of data type BSON.





[jira] [Assigned] (PHOENIX-7396) BSON_VALUE function to retrieve BSON field value with given data type

2024-08-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7396:
-

Assignee: Viraj Jasani

> BSON_VALUE function to retrieve BSON field value with given data type
> -
>
> Key: PHOENIX-7396
> URL: https://issues.apache.org/jira/browse/PHOENIX-7396
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> The purpose of this Jira is to introduce a new function for the BSON data 
> type to retrieve the value of a given BSON document field key. The function 
> should also take a data type as an argument to decode the document field 
> value.
> h2. *Function Grammar:*
> *Name:* BSON_VALUE
> *Arguments:*
> | |*Expression*|*DataType*|
> |1|Column Value|BSON|
> |2|Bson Field Key|The field key can represent any top level or nested fields 
> within the document. The caller should use "." notation for accessing nested 
> document elements and "[n]" notation for accessing nested array elements. 
> Unlike nested fields, top level document fields do not require any additional 
> character.|
> |3|SQL Data Type|The data type that the client expects the value of the field 
> to be converted to while returning the value.|
>  
> *Definition:* The function returns the value of the given field key from the 
> BSON Document. The client is expected to provide the data type that is used 
> for decoding the value of the field key.
> *Return Type:* PDataType (Depending on the third argument of the function, 
> the data type conversion takes place)
> *Examples:*
>  * BSON_VALUE(COL, 'topfield', 'DOUBLE')
>  * BSON_VALUE(COL, 'topfield.nestedfield1', 'VARCHAR')
>  * BSON_VALUE(COL, 'topfield.nestedfield[2]', 'INTEGER')
> Here, COL represents the column name of data type BSON.





[jira] [Created] (PHOENIX-7396) BSON_VALUE function to retrieve BSON field value with given data type

2024-08-27 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7396:
-

 Summary: BSON_VALUE function to retrieve BSON field value with 
given data type
 Key: PHOENIX-7396
 URL: https://issues.apache.org/jira/browse/PHOENIX-7396
 Project: Phoenix
  Issue Type: New Feature
Reporter: Viraj Jasani


The purpose of this Jira is to introduce a new function for the BSON data 
type to retrieve the value of a given BSON document field key. The function 
should also take a data type as an argument to decode the document field 
value.
h2. *Function Grammar:*

*Name:* BSON_VALUE

*Arguments:*
| |*Expression*|*DataType*|
|1|Column Value|BSON|
|2|Bson Field Key|The field key can represent any top level or nested fields 
within the document. The caller should use "." notation for accessing nested 
document elements and "[n]" notation for accessing nested array elements. 
Unlike nested fields, top level document fields do not require any additional 
character.|
|3|SQL Data Type|The data type that the client expects the value of the field 
to be converted to while returning the value.|

 

*Definition:* The function returns the value of the given field key from the 
BSON Document. The client is expected to provide the data type that is used for 
decoding the value of the field key.

*Return Type:* PDataType (Depending on the third argument of the function, the 
data type conversion takes place)

*Examples:*
 * BSON_VALUE(COL, 'topfield', 'DOUBLE')
 * BSON_VALUE(COL, 'topfield.nestedfield1', 'VARCHAR')
 * BSON_VALUE(COL, 'topfield.nestedfield[2]', 'INTEGER')

Here, COL represents the column name of data type BSON.
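The "." and "[n]" field-key notation above can be sketched with a small path resolver over a parsed document (represented here as a Python dict). This is an illustrative sketch only, not Phoenix's actual BSON implementation, and the function name is hypothetical:

```python
import re

# Illustrative resolver for field keys such as "topfield.nestedfield[2]".
def resolve_field(document, field_key):
    value = document
    for part in field_key.split("."):
        # Split "name[2][0]" into the field name and any array indexes.
        m = re.match(r"^([^\[\]]+)((?:\[\d+\])*)$", part)
        name, indexes = m.group(1), re.findall(r"\[(\d+)\]", m.group(2))
        value = value[name]           # descend into the nested document
        for idx in indexes:
            value = value[int(idx)]   # descend into the nested array
    return value

doc = {"topfield": {"nestedfield1": "a", "nestedfield": [10, 20, 30]}}
print(resolve_field(doc, "topfield.nestedfield[2]"))  # 30
```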





[jira] [Created] (PHOENIX-7395) Metadata Cache metrics at server and client side

2024-08-27 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7395:
-

 Summary: Metadata Cache metrics at server and client side
 Key: PHOENIX-7395
 URL: https://issues.apache.org/jira/browse/PHOENIX-7395
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


Phoenix maintains a cache of PTable objects, also known as the Metadata cache 
(a Guava cache), on both the server and client side. The size of the 
server-side cache is determined by the config 
_phoenix.coprocessor.maxMetaDataCacheSize_ with a default value of 20 MB. 
Similarly, the size of the client-side cache is determined by 
_phoenix.client.maxMetaDataCacheSize_ with a default value of 10 MB.

To understand whether the sizes of the metadata caches on the client and 
server side are sufficient for the given cluster and the given client, we 
need some visibility into how efficiently the caches are being utilized.

The purpose of this Jira is to add some metrics for both of these caches:
 * Used cache size (in bytes)
 * Cache hit rates (per second)
 * Cache miss rates (per second)
 * Cache eviction rates (per second)





[jira] [Updated] (PHOENIX-7382) Eliminating index building and treating max lookback as TTL for CDC Index

2024-08-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7382:
--
Description: 
PHOENIX-7001 introduced a Change Data Capture (CDC) feature by leveraging 
Phoenix Max Lookback and Uncovered Global Index on PHOENIX_ROW_TIMESTAMP(). The 
max lookback feature retains recent changes to a table and the uncovered index 
allows efficient retrieval of the changes in the order of their arrival.

Since the changes are retained only within the max lookback window, a CDC index 
does not need to have rows beyond the max lookback window. This means we can 
treat the max lookback age as the TTL for CDC indexes. 

When the CDC feature is enabled on a table, a CDC index is created in the 
building state and, like any other index, it is built from the data table. 
This means that for every mutation on the data table, an index row mutation 
is built. However, mutations beyond the max lookback window are not required 
for CDC. Since an index build is an expensive background operation (it can 
take days for very large tables), eliminating the index build is desirable 
when CDC is enabled on a table. This means the CDC feature will track 
changes that happen after it is enabled, which is the expected behavior. 

The code changes for this improvement will be for
 * Creating a CDC index with the active state 
 * Treating the max lookback age as the TTL for CDC indexes during compaction, 
and index verification, repair and rebuild.

  was:
Phoenix-7001 introduced a Change Data Capture (CDC) feature by leveraging 
Phoenix Max Lookback and Uncovered Global Index on PHOENIX_ROW_TIMESTAMP(). The 
max lookback feature retains recent changes to a table and the uncovered index 
allows efficient retrieval of the changes in the order of their arrival.

Since the changes are retained only within the max lookback window, a CDC index 
does not need to have rows beyond the max lookback window. This means we can 
treat the max lookback age as the TTL for CDC indexes. 

When a CDC feature is enabled on a table, a CDC index is created with the 
building state, and like any other index, it is built from the data table. This 
means for every mutation on the data table, an index row mutation is built. 
However, the mutations beyond max lookback are not required for CDC. Since 
index built is an expensive background operation (it can take days for very 
large tables), eliminating index built is desirable when the CDC is enabled on 
a table. This means that the CDC feature will track changes that happen after 
it is enabled, which is the expected behavior. 

The code changes for this improvement will be for
 * Creating a CDC index with the active state 
 * Treating the max lookback age as the TTL for CDC indexes during compaction, 
and index verification, repair and rebuild.


> Eliminating index building and treating max lookback as TTL for CDC Index
> -
>
> Key: PHOENIX-7382
> URL: https://issues.apache.org/jira/browse/PHOENIX-7382
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Kadir Ozdemir
>Assignee: Kadir Ozdemir
>Priority: Major
>
> PHOENIX-7001 introduced a Change Data Capture (CDC) feature by leveraging 
> Phoenix Max Lookback and Uncovered Global Index on PHOENIX_ROW_TIMESTAMP(). 
> The max lookback feature retains recent changes to a table and the uncovered 
> index allows efficient retrieval of the changes in the order of their arrival.
> Since the changes are retained only within the max lookback window, a CDC 
> index does not need to have rows beyond the max lookback window. This means 
> we can treat the max lookback age as the TTL for CDC indexes. 
> When the CDC feature is enabled on a table, a CDC index is created in the 
> building state and, like any other index, it is built from the data table. 
> This means that for every mutation on the data table, an index row 
> mutation is built. However, mutations beyond the max lookback window are 
> not required for CDC. Since an index build is an expensive background 
> operation (it can take days for very large tables), eliminating the index 
> build is desirable when CDC is enabled on a table. This means the CDC 
> feature will track changes that happen after it is enabled, which is the 
> expected behavior. 
> The code changes for this improvement will be for
>  * Creating a CDC index with the active state 
>  * Treating the max lookback age as the TTL for CDC indexes during 
> compaction, and index verification, repair and rebuild.
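Treating the max lookback age as the TTL reduces, at compaction time, to a per-cell retention check. The sketch below is illustrative only; Phoenix's real logic lives in its Java coprocessors, and the function name is hypothetical:

```python
# A CDC index cell is kept only while it is inside the max lookback window;
# cells older than the window are dropped, i.e. max lookback acts as the TTL.
def keep_cdc_index_cell(cell_timestamp_ms, now_ms, max_lookback_age_ms):
    return now_ms - cell_timestamp_ms <= max_lookback_age_ms

now = 1_000_000
print(keep_cdc_index_cell(now - 5_000, now, 10_000))   # True: inside window
print(keep_cdc_index_cell(now - 20_000, now, 10_000))  # False: beyond window
```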





[jira] [Assigned] (PHOENIX-7393) Exclude woodstox-core to fix CVE-2022-40152

2024-08-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7393:
-

Assignee: Grzegorz Kokosinski

> Exclude woodstox-core to fix CVE-2022-40152
> ---
>
> Key: PHOENIX-7393
> URL: https://issues.apache.org/jira/browse/PHOENIX-7393
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Grzegorz Kokosinski
>Assignee: Grzegorz Kokosinski
>Priority: Major
>
> Exclude woodstox-core to fix 
> [CVE-2022-40152|https://github.com/advisories/GHSA-3f7h-mf4q-vrm4] 
> ([https://nvd.nist.gov/vuln/detail/CVE-2022-40152]).
> This is a transitive dependency from Hadoop and is most likely not needed 
> by Phoenix. Note that any product using {{phoenix-client-embedded}} to use 
> Phoenix internally is flagged with this CVE.
> This dependency is used in the Trino Phoenix connector, which makes all of 
> Trino flagged with this CVE.





[jira] [Updated] (PHOENIX-7394) MaxPhoenixColumnSizeExceededException should not print rowkey

2024-08-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7394:
--
Fix Version/s: 5.2.1
   5.3.0
   5.1.4

> MaxPhoenixColumnSizeExceededException should not print rowkey
> -
>
> Key: PHOENIX-7394
> URL: https://issues.apache.org/jira/browse/PHOENIX-7394
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> PHOENIX-6167 introduced the config "hbase.client.keyvalue.maxsize" with a 
> default value of 10 MB to limit the size of column values while preparing 
> UPSERT statements.
> For ONE_CELL_PER_COLUMN storage, if any column value is larger than the 
> limit (10 MB by default), MaxPhoenixColumnSizeExceededException is thrown. 
> However, the exception should only provide table and column info; the 
> rowkey should not be printed.





[jira] [Assigned] (PHOENIX-7394) MaxPhoenixColumnSizeExceededException should not print rowkey

2024-08-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7394:
-

Assignee: Jing Yu

> MaxPhoenixColumnSizeExceededException should not print rowkey
> -
>
> Key: PHOENIX-7394
> URL: https://issues.apache.org/jira/browse/PHOENIX-7394
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> PHOENIX-6167 introduced the config "hbase.client.keyvalue.maxsize" with a 
> default value of 10 MB to limit the size of column values while preparing 
> UPSERT statements.
> For ONE_CELL_PER_COLUMN storage, if any column value is larger than the 
> limit (10 MB by default), MaxPhoenixColumnSizeExceededException is thrown. 
> However, the exception should only provide table and column info; the 
> rowkey should not be printed.





[jira] [Created] (PHOENIX-7394) MaxPhoenixColumnSizeExceededException should not print rowkey

2024-08-26 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7394:
-

 Summary: MaxPhoenixColumnSizeExceededException should not print 
rowkey
 Key: PHOENIX-7394
 URL: https://issues.apache.org/jira/browse/PHOENIX-7394
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


PHOENIX-6167 introduced the config "hbase.client.keyvalue.maxsize" with a 
default value of 10 MB to limit the size of column values while preparing 
UPSERT statements.

For ONE_CELL_PER_COLUMN storage, if any column value is larger than the limit 
(10 MB by default), MaxPhoenixColumnSizeExceededException is thrown. However, 
the exception should only provide table and column info; the rowkey should 
not be printed.
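The intended behavior can be sketched as a size check whose error message carries only the table and column. This is a hypothetical illustration; the real exception is MaxPhoenixColumnSizeExceededException in Phoenix's Java client, and the names below are not Phoenix's:

```python
MAX_COLUMN_SIZE = 10 * 1024 * 1024  # default "hbase.client.keyvalue.maxsize": 10 MB

def check_column_size(table, column, value, rowkey):
    if len(value) > MAX_COLUMN_SIZE:
        # Deliberately omit the rowkey from the message: it may carry
        # sensitive data and must not be printed.
        raise ValueError(
            f"Column value for {table}.{column} is {len(value)} bytes, "
            f"exceeding the {MAX_COLUMN_SIZE} byte limit")

check_column_size("MY_TABLE", "COL1", b"small", b"\x00row")  # under the limit: no error
```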





[jira] [Resolved] (PHOENIX-7375) CQSI connection init from regionserver hosting SYSTEM.CATALOG does not require RPC calls to system tables

2024-08-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7375.
---
Resolution: Fixed

> CQSI connection init from regionserver hosting SYSTEM.CATALOG does not 
> require RPC calls to system tables
> -
>
> Key: PHOENIX-7375
> URL: https://issues.apache.org/jira/browse/PHOENIX-7375
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> In order to execute any query on the server side, the Phoenix client 
> requires a CQSI (ConnectionQueryServicesImpl) connection, which internally 
> initiates and maintains a long-lasting connection to the HBase server. A 
> CQSI connection is unique per JDBC URL used to initiate the connection. 
> Once created, CQSI connections are cached for 24 hours (by default) for 
> every unique JDBC URL provided.
> When a client initiates a CQSI connection, the initialization also executes 
> some metadata queries to ensure that system tables like SYSTEM.CATALOG 
> exist and that the client version is compatible with the server version. 
> For this, CQSI#init makes RPC calls against the MetaDataEndpointImpl 
> coprocessor.
> This operation is valid for every CQSI connection initiated for every 
> unique JDBC URL by every client. However, when the server hosting 
> SYSTEM.CATALOG initiates a CQSI connection, SYSTEM.CATALOG and the other 
> system tables already exist. Moreover, the client/server version 
> compatibility check is not required because the connection is being created 
> from the same server that hosts SYSTEM.CATALOG.
> Metadata operations performed by the regionserver hosting the 
> SYSTEM.CATALOG region(s) also hold a row-level write lock for the given 
> PTable entry. Hence, this improvement is also expected to bring some 
> performance improvement for metadata operations.
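The short-circuit described above amounts to skipping both metadata RPCs when the connection originates on the SYSTEM.CATALOG host. A minimal sketch, with hypothetical names (not Phoenix's actual CQSI code):

```python
def init_cqsi(hosts_system_catalog, ensure_system_tables, check_compatibility):
    """Skip the system-table existence and version-compatibility RPCs when the
    connection is initiated from the regionserver hosting SYSTEM.CATALOG."""
    results = []
    if not hosts_system_catalog:
        # Only connections from elsewhere need the SYSTEM.CATALOG existence
        # check and the client/server version compatibility check.
        results.append(ensure_system_tables())
        results.append(check_compatibility())
    return results

# On the SYSTEM.CATALOG host itself, no metadata RPCs are issued:
print(init_cqsi(True, lambda: "ensureTables", lambda: "checkVersion"))  # []
```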





[jira] [Updated] (PHOENIX-7375) CQSI connection init from regionserver hosting SYSTEM.CATALOG does not require RPC calls to system tables

2024-08-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7375:
--
Fix Version/s: 5.3.0

> CQSI connection init from regionserver hosting SYSTEM.CATALOG does not 
> require RPC calls to system tables
> -
>
> Key: PHOENIX-7375
> URL: https://issues.apache.org/jira/browse/PHOENIX-7375
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> In order to execute any query on the server side, the Phoenix client 
> requires a CQSI (ConnectionQueryServicesImpl) connection, which internally 
> initiates and maintains a long-lasting connection to the HBase server. A 
> CQSI connection is unique per JDBC URL used to initiate the connection. 
> Once created, CQSI connections are cached for 24 hours (by default) for 
> every unique JDBC URL provided.
> When a client initiates a CQSI connection, the initialization also executes 
> some metadata queries to ensure that system tables like SYSTEM.CATALOG 
> exist and that the client version is compatible with the server version. 
> For this, CQSI#init makes RPC calls against the MetaDataEndpointImpl 
> coprocessor.
> This operation is valid for every CQSI connection initiated for every 
> unique JDBC URL by every client. However, when the server hosting 
> SYSTEM.CATALOG initiates a CQSI connection, SYSTEM.CATALOG and the other 
> system tables already exist. Moreover, the client/server version 
> compatibility check is not required because the connection is being created 
> from the same server that hosts SYSTEM.CATALOG.
> Metadata operations performed by the regionserver hosting the 
> SYSTEM.CATALOG region(s) also hold a row-level write lock for the given 
> PTable entry. Hence, this improvement is also expected to bring some 
> performance improvement for metadata operations.





[jira] [Resolved] (PHOENIX-7386) Override UPDATE_CACHE_FREQUENCY if table has disabled indexes

2024-08-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7386.
---
Resolution: Fixed

> Override UPDATE_CACHE_FREQUENCY if table has disabled indexes
> -
>
> Key: PHOENIX-7386
> URL: https://issues.apache.org/jira/browse/PHOENIX-7386
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> If a table has UPDATE_CACHE_FREQUENCY set to a non-default value, i.e. 
> anything other than ALWAYS, PTable objects are cached in the client-side 
> metadata cache for that duration. One of the common practices when creating 
> an index is to create it in CREATE_DISABLE state.
> When any index is in DISABLE, CREATE_DISABLE or PENDING_ACTIVE state, the 
> Phoenix client does not include the PTable of the corresponding index in 
> the IndexMaintainer objects sent to the server. Since indexes are expected 
> to move from the above disabled states to either dropped or building, it is 
> crucial for the Phoenix client to override the non-default value of 
> UPDATE_CACHE_FREQUENCY on the base table. This helps avoid data integrity 
> issues: until the index state becomes BUILDING or ACTIVE, the client keeps 
> overriding UPDATE_CACHE_FREQUENCY to the default value so that UPSERT 
> and/or SELECT queries initiate the getTable() RPC and refresh the 
> client-side metadata cache entry.





[jira] [Updated] (PHOENIX-7386) Override UPDATE_CACHE_FREQUENCY if table has disabled indexes

2024-08-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7386:
--
Fix Version/s: 5.2.1
   5.3.0

> Override UPDATE_CACHE_FREQUENCY if table has disabled indexes
> -
>
> Key: PHOENIX-7386
> URL: https://issues.apache.org/jira/browse/PHOENIX-7386
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> If a table has UPDATE_CACHE_FREQUENCY set to a non-default value, i.e. 
> anything other than ALWAYS, PTable objects are cached in the client-side 
> metadata cache for that duration. One of the common practices when creating 
> an index is to create it in CREATE_DISABLE state.
> When any index is in DISABLE, CREATE_DISABLE or PENDING_ACTIVE state, the 
> Phoenix client does not include the PTable of the corresponding index in 
> the IndexMaintainer objects sent to the server. Since indexes are expected 
> to move from the above disabled states to either dropped or building, it is 
> crucial for the Phoenix client to override the non-default value of 
> UPDATE_CACHE_FREQUENCY on the base table. This helps avoid data integrity 
> issues: until the index state becomes BUILDING or ACTIVE, the client keeps 
> overriding UPDATE_CACHE_FREQUENCY to the default value so that UPSERT 
> and/or SELECT queries initiate the getTable() RPC and refresh the 
> client-side metadata cache entry.





[jira] [Resolved] (PHOENIX-7390) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-21 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7390.
---
Resolution: Duplicate

Dup of PHOENIX-7388

> Phoenix metadata updates should fail-fast for noisy neighbor
> 
>
> Key: PHOENIX-7390
> URL: https://issues.apache.org/jira/browse/PHOENIX-7390
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Viraj Jasani
>Priority: Major
>
> Phoenix is a high-scale, low-latency, high-throughput multi-tenant 
> database. Multi-tenancy comes with its own set of challenges, one of which 
> is the noisy neighbor problem. A single client can initiate a very high 
> number of tenant view updates (e.g. drop view, create view, create index) 
> while all other clients are making RPC calls to SYSTEM.CATALOG to retrieve 
> the updated PTable objects. With more metadata update calls, it is 
> possible for more RPC calls to get stuck while waiting for an HBase 
> RowLock to be acquired. We have also seen high memory pressure with an 
> increasing number of metadata update calls.
>  
> HBase RowLock by default has a 30s timeout for acquiring the lock, 
> configurable via {_}hbase.rowlock.wait.duration{_}. While this applies at 
> the cluster level, Phoenix metadata RPC calls should have a much lower 
> timeout for RowLock acquisition because metadata updates and reads are 
> expected to be extremely low-latency operations. Otherwise, we are 
> essentially blocking some clients from getting enough RPC handlers to 
> execute the getTable RPC call, or causing significant delays for ongoing 
> getTable calls.
> While HBASE-28797 proposes a new Region API for acquiring a RowLock, 
> Phoenix already has its own RowLock implementation, and it is already used 
> by getTable RPC calls to protect server-side metadata cache updates 
> (PHOENIX-7363).
> The proposal of this Jira is to stop using the HBase RowLock for Phoenix 
> metadata operations and instead use the Phoenix RowLock with a default 
> timeout of 3 seconds.
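The fail-fast behavior proposed above can be sketched as a lock acquisition with a short timeout instead of HBase's 30-second default. The names and structure below are illustrative, not Phoenix's actual RowLock implementation:

```python
import threading

ROW_LOCK_TIMEOUT_SECONDS = 3.0  # proposed Phoenix RowLock default

def with_row_lock(lock, critical_section, timeout=ROW_LOCK_TIMEOUT_SECONDS):
    """Fail fast rather than letting RPC handlers queue behind a hot row."""
    if not lock.acquire(timeout=timeout):
        raise TimeoutError("could not acquire row lock; failing fast")
    try:
        return critical_section()
    finally:
        lock.release()

print(with_row_lock(threading.Lock(), lambda: "metadata updated"))  # metadata updated
```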





[jira] [Resolved] (PHOENIX-7389) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-21 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7389.
---
Resolution: Duplicate

> Phoenix metadata updates should fail-fast for noisy neighbor
> 
>
> Key: PHOENIX-7389
> URL: https://issues.apache.org/jira/browse/PHOENIX-7389
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Viraj Jasani
>Priority: Major
>
> Phoenix is a high-scale, low-latency, high-throughput multi-tenant 
> database. Multi-tenancy comes with its own set of challenges, one of which 
> is the noisy neighbor problem. A single client can initiate a very high 
> number of tenant view updates (e.g. drop view, create view, create index) 
> while all other clients are making RPC calls to SYSTEM.CATALOG to retrieve 
> the updated PTable objects. With more metadata update calls, it is 
> possible for more RPC calls to get stuck while waiting for an HBase 
> RowLock to be acquired. We have also seen high memory pressure with an 
> increasing number of metadata update calls.
>  
> HBase RowLock by default has a 30s timeout for acquiring the lock, 
> configurable via {_}hbase.rowlock.wait.duration{_}. While this applies at 
> the cluster level, Phoenix metadata RPC calls should have a much lower 
> timeout for RowLock acquisition because metadata updates and reads are 
> expected to be extremely low-latency operations. Otherwise, we are 
> essentially blocking some clients from getting enough RPC handlers to 
> execute the getTable RPC call, or causing significant delays for ongoing 
> getTable calls.
> While HBASE-28797 proposes a new Region API for acquiring a RowLock, 
> Phoenix already has its own RowLock implementation, and it is already used 
> by getTable RPC calls to protect server-side metadata cache updates 
> (PHOENIX-7363).
> The proposal of this Jira is to stop using the HBase RowLock for Phoenix 
> metadata operations and instead use the Phoenix RowLock with a default 
> timeout of 3 seconds.





[jira] [Updated] (PHOENIX-7379) Improve handling of concurrent index mutations with the same timestamp

2024-08-20 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7379:
--
Fix Version/s: 5.3.0

> Improve handling of concurrent index mutations with the same timestamp
> --
>
> Key: PHOENIX-7379
> URL: https://issues.apache.org/jira/browse/PHOENIX-7379
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Kadir Ozdemir
>Assignee: Kadir Ozdemir
>Priority: Major
> Fix For: 5.3.0
>
>
> After preparing the index updates, just before releasing the row locks for a 
> given batch, IndexRegionObserver checks whether the current millisecond is the 
> same as the timestamp assigned to this batch. If so, the thread handling this 
> batch sleeps for 1 ms so that the next batch of updates does not get the same 
> timestamp. Then it releases the row locks.
> This is done to prevent two different mutations with the same timestamp on the 
> same row, since the order of these mutations on the data table and the index 
> cannot be guaranteed to be the same. If the order is not the same, the data 
> table and the index will be inconsistent.
> One drawback of this approach is that if the index mutation preparation takes 
> less than 1 ms, the data table mutation latency increases by 1 ms. Index 
> preparation takes more than 1 ms when IndexRegionObserver retrieves the current 
> state of the data table row from disk. However, if the row is cached in 
> memory, index preparation can easily take less than 1 ms. Also, 
> IndexRegionObserver usually does not need to retrieve the current row state 
> for uncovered indexes, so for uncovered indexes this logic almost always adds 
> 1 ms to the mutation latency.
> We can improve this by not sleeping proactively, and instead sleeping only 
> when a mutation on a row would get the same timestamp as the previous 
> mutation on the same row.
>  
>  
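The reactive approach proposed above can be sketched as follows: remember the last timestamp handed out per row and pay the 1 ms sleep only when a collision actually occurs. Class and method names are illustrative, not Phoenix's actual IndexRegionObserver code.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of collision-driven timestamp assignment: instead of always sleeping
// 1 ms before releasing row locks, track the last timestamp used per row and
// sleep only when a new mutation would reuse it.
public class TimestampAllocator {
    private final ConcurrentHashMap<String, Long> lastTsPerRow = new ConcurrentHashMap<>();

    /** Returns a timestamp strictly greater than the row's previous one. */
    public synchronized long nextTimestamp(String rowKey) throws InterruptedException {
        long ts = System.currentTimeMillis();
        Long prev = lastTsPerRow.get(rowKey);
        while (prev != null && ts <= prev) {
            Thread.sleep(1); // pay the 1 ms only when a collision actually occurs
            ts = System.currentTimeMillis();
        }
        lastTsPerRow.put(rowKey, ts);
        return ts;
    }
}
```

With this shape, batches whose preparation already took longer than 1 ms never sleep, which is the common case when the row state is read from disk.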





[jira] [Updated] (PHOENIX-7388) Metadata updates should fail-fast for noisy neighbor using Phoenix RowLock

2024-08-20 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7388:
--
Summary: Metadata updates should fail-fast for noisy neighbor using Phoenix 
RowLock  (was: Phoenix metadata updates should fail-fast for noisy neighbor)

> Metadata updates should fail-fast for noisy neighbor using Phoenix RowLock
> --
>
> Key: PHOENIX-7388
> URL: https://issues.apache.org/jira/browse/PHOENIX-7388
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Phoenix is a high-scale, low-latency, high-throughput multi-tenant database. 
> Multi-tenancy comes with its own set of challenges, one of which is the 
> noisy neighbour problem. A single client can initiate a very high number of 
> tenant view updates (e.g. drop view, create view, create index) while all 
> other clients are making RPC calls to SYSTEM.CATALOG to retrieve the 
> updated PTable objects. With more metadata update calls, more RPC calls can 
> get stuck waiting for the HBase RowLock to be acquired. We have also seen 
> high memory pressure as the number of metadata update APIs increases.
>  
> HBase RowLock by default has a 30s timeout for acquiring the lock, which is 
> configurable via {_}hbase.rowlock.wait.duration{_}. While this applies 
> at the cluster level, Phoenix metadata RPC calls are expected to use a much 
> lower timeout for RowLock acquisition because metadata updates and 
> reads are expected to be extremely low-latency operations. If that is not the 
> case, we are essentially either starving some clients of enough RPC 
> handlers to execute getTable RPC calls or causing significant delays to 
> ongoing getTable calls.
> While HBASE-28797 proposes a new Region API for acquiring a 
> RowLock, Phoenix already has its own RowLock implementation, and it is already 
> used by getTable RPC calls to protect server-side metadata cache 
> updates (PHOENIX-7363).
> This Jira proposes eliminating the HBase RowLock for all Phoenix 
> metadata operations and using the Phoenix RowLock with a default timeout of 3 sec.





[jira] [Created] (PHOENIX-7388) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-20 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7388:
-

 Summary: Phoenix metadata updates should fail-fast for noisy 
neighbor
 Key: PHOENIX-7388
 URL: https://issues.apache.org/jira/browse/PHOENIX-7388
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.1.3, 5.2.0
Reporter: Viraj Jasani


Phoenix is a high-scale, low-latency, high-throughput multi-tenant database. 
Multi-tenancy comes with its own set of challenges, one of which is the noisy 
neighbour problem. A single client can initiate a very high number of tenant 
view updates (e.g. drop view, create view, create index) while all other 
clients are making RPC calls to SYSTEM.CATALOG to retrieve the updated PTable 
objects. With more metadata update calls, more RPC calls can get stuck waiting 
for the HBase RowLock to be acquired. We have also seen high memory pressure 
as the number of metadata update APIs increases.

HBase RowLock by default has a 30s timeout for acquiring the lock, which is 
configurable via {_}hbase.rowlock.wait.duration{_}. While this applies at the 
cluster level, Phoenix metadata RPC calls are expected to use a much lower 
timeout for RowLock acquisition because metadata updates and reads are 
expected to be extremely low-latency operations. If that is not the case, we 
are essentially either starving some clients of enough RPC handlers to execute 
getTable RPC calls or causing significant delays to ongoing getTable calls.

While HBASE-28797 proposes a new Region API for acquiring a RowLock, Phoenix 
already has its own RowLock implementation, and it is already used by getTable 
RPC calls to protect server-side metadata cache updates (PHOENIX-7363).

This Jira proposes eliminating the HBase RowLock for all Phoenix metadata 
operations and using the Phoenix RowLock with a default timeout of 3 sec.
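The fail-fast idea can be sketched as a bounded lock acquisition. This is a minimal illustration using a plain ReentrantLock with a configurable timeout; the class and method names, and the 3s default, are illustrative, and Phoenix's real RowLock implementation differs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of fail-fast row lock acquisition: bound the wait with a
// short timeout so a noisy neighbour cannot pin RPC handlers for the full
// 30s HBase default. Names and the 3s default are illustrative only.
public class FailFastRowLock {
    private final ReentrantLock lock = new ReentrantLock();
    private final long timeoutMs;

    public FailFastRowLock(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Acquire the lock or fail fast; the caller must call unlock() on success. */
    public boolean tryAcquire() throws InterruptedException {
        return lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public void unlock() {
        lock.unlock();
    }
}
```

A getTable RPC built on this shape would surface a retriable error as soon as tryAcquire() returns false, instead of blocking the handler thread for the cluster-wide HBase timeout.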





[jira] [Created] (PHOENIX-7389) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-20 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7389:
-

 Summary: Phoenix metadata updates should fail-fast for noisy 
neighbor
 Key: PHOENIX-7389
 URL: https://issues.apache.org/jira/browse/PHOENIX-7389
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.1.3, 5.2.0
Reporter: Viraj Jasani


Phoenix is a high-scale, low-latency, high-throughput multi-tenant database. 
Multi-tenancy comes with its own set of challenges, one of which is the noisy 
neighbour problem. A single client can initiate a very high number of tenant 
view updates (e.g. drop view, create view, create index) while all other 
clients are making RPC calls to SYSTEM.CATALOG to retrieve the updated PTable 
objects. With more metadata update calls, more RPC calls can get stuck waiting 
for the HBase RowLock to be acquired. We have also seen high memory pressure 
as the number of metadata update APIs increases.

HBase RowLock by default has a 30s timeout for acquiring the lock, which is 
configurable via {_}hbase.rowlock.wait.duration{_}. While this applies at the 
cluster level, Phoenix metadata RPC calls are expected to use a much lower 
timeout for RowLock acquisition because metadata updates and reads are 
expected to be extremely low-latency operations. If that is not the case, we 
are essentially either starving some clients of enough RPC handlers to execute 
getTable RPC calls or causing significant delays to ongoing getTable calls.

While HBASE-28797 proposes a new Region API for acquiring a RowLock, Phoenix 
already has its own RowLock implementation, and it is already used by getTable 
RPC calls to protect server-side metadata cache updates (PHOENIX-7363).

This Jira proposes eliminating the HBase RowLock for all Phoenix metadata 
operations and using the Phoenix RowLock with a default timeout of 3 sec.





[jira] [Created] (PHOENIX-7390) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-20 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7390:
-

 Summary: Phoenix metadata updates should fail-fast for noisy 
neighbor
 Key: PHOENIX-7390
 URL: https://issues.apache.org/jira/browse/PHOENIX-7390
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.1.3, 5.2.0
Reporter: Viraj Jasani


Phoenix is a high-scale, low-latency, high-throughput multi-tenant database. 
Multi-tenancy comes with its own set of challenges, one of which is the noisy 
neighbour problem. A single client can initiate a very high number of tenant 
view updates (e.g. drop view, create view, create index) while all other 
clients are making RPC calls to SYSTEM.CATALOG to retrieve the updated PTable 
objects. With more metadata update calls, more RPC calls can get stuck waiting 
for the HBase RowLock to be acquired. We have also seen high memory pressure 
as the number of metadata update APIs increases.

HBase RowLock by default has a 30s timeout for acquiring the lock, which is 
configurable via {_}hbase.rowlock.wait.duration{_}. While this applies at the 
cluster level, Phoenix metadata RPC calls are expected to use a much lower 
timeout for RowLock acquisition because metadata updates and reads are 
expected to be extremely low-latency operations. If that is not the case, we 
are essentially either starving some clients of enough RPC handlers to execute 
getTable RPC calls or causing significant delays to ongoing getTable calls.

While HBASE-28797 proposes a new Region API for acquiring a RowLock, Phoenix 
already has its own RowLock implementation, and it is already used by getTable 
RPC calls to protect server-side metadata cache updates (PHOENIX-7363).

This Jira proposes eliminating the HBase RowLock for all Phoenix metadata 
operations and using the Phoenix RowLock with a default timeout of 3 sec.





[jira] [Assigned] (PHOENIX-7388) Phoenix metadata updates should fail-fast for noisy neighbor

2024-08-20 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7388:
-

Assignee: Viraj Jasani

> Phoenix metadata updates should fail-fast for noisy neighbor
> 
>
> Key: PHOENIX-7388
> URL: https://issues.apache.org/jira/browse/PHOENIX-7388
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0, 5.1.3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Phoenix is a high-scale, low-latency, high-throughput multi-tenant database. 
> Multi-tenancy comes with its own set of challenges, one of which is the 
> noisy neighbour problem. A single client can initiate a very high number of 
> tenant view updates (e.g. drop view, create view, create index) while all 
> other clients are making RPC calls to SYSTEM.CATALOG to retrieve the 
> updated PTable objects. With more metadata update calls, more RPC calls can 
> get stuck waiting for the HBase RowLock to be acquired. We have also seen 
> high memory pressure as the number of metadata update APIs increases.
>  
> HBase RowLock by default has a 30s timeout for acquiring the lock, which is 
> configurable via {_}hbase.rowlock.wait.duration{_}. While this applies 
> at the cluster level, Phoenix metadata RPC calls are expected to use a much 
> lower timeout for RowLock acquisition because metadata updates and 
> reads are expected to be extremely low-latency operations. If that is not the 
> case, we are essentially either starving some clients of enough RPC 
> handlers to execute getTable RPC calls or causing significant delays to 
> ongoing getTable calls.
> While HBASE-28797 proposes a new Region API for acquiring a 
> RowLock, Phoenix already has its own RowLock implementation, and it is already 
> used by getTable RPC calls to protect server-side metadata cache 
> updates (PHOENIX-7363).
> This Jira proposes eliminating the HBase RowLock for all Phoenix 
> metadata operations and using the Phoenix RowLock with a default timeout of 3 sec.





[jira] [Assigned] (PHOENIX-7386) Override UPDATE_CACHE_FREQUENCY if table has disabled indexes

2024-08-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7386:
-

Assignee: Viraj Jasani

> Override UPDATE_CACHE_FREQUENCY if table has disabled indexes
> -
>
> Key: PHOENIX-7386
> URL: https://issues.apache.org/jira/browse/PHOENIX-7386
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> If a table has UPDATE_CACHE_FREQUENCY set to a non-default value, i.e. 
> anything other than ALWAYS, PTable objects are cached for that duration in 
> the client-side metadata cache. One common pattern is to create an index in 
> the CREATE_DISABLE state.
> When an index is in the DISABLE, CREATE_DISABLE or PENDING_ACTIVE state, the 
> Phoenix client does not include the PTable of the corresponding index in the 
> IndexMaintainer objects sent to the server. Since indexes are expected to be 
> either dropped or moved to a building state from the above disabled states, 
> it is crucial for the Phoenix client to override the non-default value of 
> UPDATE_CACHE_FREQUENCY on the base table. This helps avoid data integrity 
> issues: until the index state becomes BUILDING or ACTIVE, the client will 
> override UPDATE_CACHE_FREQUENCY to the default value so that UPSERT and/or 
> SELECT queries initiate getTable() RPCs and refresh the client-side metadata 
> cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PHOENIX-7386) Override UPDATE_CACHE_FREQUENCY if table has disabled indexes

2024-08-15 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7386:
-

 Summary: Override UPDATE_CACHE_FREQUENCY if table has disabled 
indexes
 Key: PHOENIX-7386
 URL: https://issues.apache.org/jira/browse/PHOENIX-7386
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


If a table has UPDATE_CACHE_FREQUENCY set to a non-default value, i.e. anything 
other than ALWAYS, PTable objects are cached for that duration in the 
client-side metadata cache. One common pattern is to create an index in the 
CREATE_DISABLE state.

When an index is in the DISABLE, CREATE_DISABLE or PENDING_ACTIVE state, the 
Phoenix client does not include the PTable of the corresponding index in the 
IndexMaintainer objects sent to the server. Since indexes are expected to be 
either dropped or moved to a building state from the above disabled states, it 
is crucial for the Phoenix client to override the non-default value of 
UPDATE_CACHE_FREQUENCY on the base table. This helps avoid data integrity 
issues: until the index state becomes BUILDING or ACTIVE, the client will 
override UPDATE_CACHE_FREQUENCY to the default value so that UPSERT and/or 
SELECT queries initiate getTable() RPCs and refresh the client-side metadata 
cache.
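The override rule can be sketched as a small policy function: if any index on the table is in a disabled-like state, treat UPDATE_CACHE_FREQUENCY as ALWAYS (0 ms) so every query re-fetches the PTable. The enum values mirror the index states named in the description, but the class and method are illustrative, not Phoenix's actual API.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Sketch of the client-side UPDATE_CACHE_FREQUENCY override: fall back to
// ALWAYS (0 ms cache duration) while any index sits in a disabled-like state,
// so getTable() RPCs keep the client-side metadata cache fresh.
public class CacheFrequencyPolicy {
    enum IndexState { ACTIVE, BUILDING, DISABLE, CREATE_DISABLE, PENDING_ACTIVE }

    private static final Set<IndexState> DISABLED_STATES =
        EnumSet.of(IndexState.DISABLE, IndexState.CREATE_DISABLE, IndexState.PENDING_ACTIVE);

    /** 0 means ALWAYS, i.e. bypass the client-side metadata cache. */
    static long effectiveCacheFrequencyMs(long configuredMs, List<IndexState> indexStates) {
        for (IndexState s : indexStates) {
            if (DISABLED_STATES.contains(s)) {
                return 0L; // override to ALWAYS until the index is BUILDING/ACTIVE
            }
        }
        return configuredMs;
    }
}
```

Once every index reaches BUILDING or ACTIVE, the configured non-default frequency applies again and client-side caching resumes.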





[jira] [Resolved] (PHOENIX-4555) Only mark view as updatable if rows cannot overlap with other updatable views

2024-08-14 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-4555.
---
Resolution: Fixed

> Only mark view as updatable if rows cannot overlap with other updatable views
> -
>
> Key: PHOENIX-4555
> URL: https://issues.apache.org/jira/browse/PHOENIX-4555
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James R. Taylor
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.3.0
>
>
> We'll run into issues if updatable sibling views overlap with each other. For 
> example, say you have the following hierarchy:
> T (A, B, C)
> V1 (D, E) FROM T WHERE A = 1
> V2 (F, G) FROM T WHERE A = 1 AND B = 2
> In this case, there's no way to update both V1 and V2 columns. Secondary 
> indexes wouldn't work either, if you had one on each of V1 & V2.
> We should restrict updatable views to:
> - views that filter on PK column(s)
> - views whose sibling views filter on the same set of PK column(s)
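The restriction above can be sketched as a predicate over the PK columns each view filters on: a view is updatable only if it filters on PK columns and every sibling view filters on the same set, so sibling rows cannot overlap. Names are illustrative, not Phoenix's actual view-resolution code.

```java
import java.util.List;
import java.util.Set;

// Sketch of the updatable-view restriction: compare the set of PK columns a
// view filters on against those of its sibling views.
public class UpdatableViewCheck {
    static boolean isUpdatable(Set<String> viewPkFilter, List<Set<String>> siblingPkFilters) {
        if (viewPkFilter.isEmpty()) {
            return false; // must filter on PK column(s)
        }
        for (Set<String> sibling : siblingPkFilters) {
            if (!sibling.equals(viewPkFilter)) {
                return false; // a sibling filters on a different set of PK columns
            }
        }
        return true;
    }
}
```

Under this check, the V1/V2 example above fails: V1 filters on {A} while its sibling V2 filters on {A, B}, so neither would be marked updatable.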





[jira] [Updated] (PHOENIX-4555) Only mark view as updatable if rows cannot overlap with other updatable views

2024-08-14 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-4555:
--
Fix Version/s: 5.3.0

> Only mark view as updatable if rows cannot overlap with other updatable views
> -
>
> Key: PHOENIX-4555
> URL: https://issues.apache.org/jira/browse/PHOENIX-4555
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James R. Taylor
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.3.0
>
>
> We'll run into issues if updatable sibling views overlap with each other. For 
> example, say you have the following hierarchy:
> T (A, B, C)
> V1 (D, E) FROM T WHERE A = 1
> V2 (F, G) FROM T WHERE A = 1 AND B = 2
> In this case, there's no way to update both V1 and V2 columns. Secondary 
> indexes wouldn't work either, if you had one on each of V1 & V2.
> We should restrict updatable views to:
> - views that filter on PK column(s)
> - views whose sibling views filter on the same set of PK column(s)





[jira] [Resolved] (PHOENIX-7330) Introducing Binary JSON (BSON) with Complex Document structures in Phoenix

2024-08-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7330.
---
Resolution: Fixed

> Introducing Binary JSON (BSON) with Complex Document structures in Phoenix
> --
>
> Key: PHOENIX-7330
> URL: https://issues.apache.org/jira/browse/PHOENIX-7330
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> The purpose of this Jira is to introduce a new data type in Phoenix: Binary 
> JSON (BSON), to manage more complex document data structures in Phoenix.
> BSON, or Binary JSON, is a binary-encoded serialization of JSON-like 
> documents. The BSON data type lets users store, update and query part or all 
> of a BsonDocument in the most performant way, without having to 
> serialize/deserialize the whole document to/from binary format. BSON allows 
> deserializing only part of a nested document, so that querying or indexing 
> any attribute within the nested structure becomes more efficient and 
> performant, as the deserialization happens at runtime. Any other document 
> structure would require deserializing the binary into the full document and 
> then performing the query.
> BSON spec: [https://bsonspec.org/]
> JSON and BSON are closely related by design. BSON serves as a binary 
> representation of JSON data, tailored with specialized extensions for wider 
> application scenarios, and finely tuned for efficient data storage and 
> traversal. Similar to JSON, BSON facilitates the embedding of objects and 
> arrays.
> One particular way in which BSON differs from JSON is its support for more 
> advanced data types. For instance, JSON does not differentiate between 
> integers (round numbers) and floating-point numbers (with decimal 
> precision). BSON does distinguish between the two and stores them in the 
> corresponding BSON data type (e.g. BsonInt32 vs BsonDouble). Many server-side 
> programming languages offer advanced numeric data types (standards include 
> integer, regular-precision floating point i.e. “float”, 
> double-precision floating point i.e. “double”, and boolean values), each with 
> its own optimal usage for efficient mathematical operations.
> Another key distinction between BSON and JSON is that BSON documents can 
> include Date or Binary objects, which cannot be directly represented in pure 
> JSON format. BSON also provides the ability to store and retrieve 
> user-defined Binary objects. Likewise, by integrating advanced data 
> structures like Sets into BSON documents, we can significantly enhance the 
> capabilities of Phoenix for storing, retrieving, and updating Binary, Sets, 
> Lists, and Documents as nested or complex data types.
> Moreover, the JSON format is both human and machine readable, whereas the 
> BSON format is only machine readable. Hence, as part of introducing the BSON 
> data type, we also need to provide a user interface so that users can provide 
> human-readable JSON as input for the BSON data type.
> This Jira also introduces access and update functions for BSON documents.
> BSON_CONDITION_EXPRESSION can evaluate a condition expression on document 
> fields, similar to how a WHERE clause evaluates a condition expression on 
> columns of the given row(s) for relational tables.
> BSON_UPDATE_EXPRESSION can perform one or more document field updates, 
> similar to how UPSERT statements can update one or more columns of the given 
> row(s) for relational tables.
>  
> Phoenix can introduce more complex data structures like sets of scalar types, 
> in addition to the nested documents and nested arrays provided by BSON.
> Overall, by combining various Phoenix functionalities like secondary indexes, 
> conditional updates, and high-throughput read/write with BSON, we can evolve 
> Phoenix into a highly scalable document database.





[jira] [Updated] (PHOENIX-7375) CQSI connection init from regionserver hosting SYSTEM.CATALOG does not require RPC calls to system tables

2024-08-06 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7375:
--
Description: 
In order to execute any query at the server side, Phoenix client requires 
creating CQSI (ConnectionQueryServicesImpl) connection, which internally 
initiates and maintains long lasting connection against HBase server. CQSI 
connection is unique per JDBC url used to initiate the connection. Once 
created, CQSI connections are cached per 24 hr (by default) for every unique 
JDBC url provided.

When client initiates CQSI connection, the connection initialization also 
attempts to execute some metadata queries to ensure that the system tables like 
SYSTEM.CATALOG exist and the client version is compatible against the server 
version. For this, CQSI#init makes RPC calls against MetaDataEndpointImpl 
coproc.

This operation is valid for every CQSI connection initiated for every unique 
JDBC url by every client. However, when server hosting SYSTEM.CATALOG initiates 
CQSI connection, it means that SYSTEM.CATALOG and other system tables already 
exist. Moreover, client/server version compatibility check is not required 
because the connection is being created from the same server that is hosting 
SYSTEM.CATALOG.

Metadata operations performed by the regionserver hosting SYSTEM.CATALOG 
region(s) also hold row level write lock for the given PTable entry. Hence, 
this improvement is also expected to bring some perf improvement for the 
metadata operations.

  was:
In order to execute any query at the server side, Phoenix client requires 
creating CQSI (ConnectionQueryServicesImpl) connection, which internally 
initiates and maintains long lasting connection against HBase server. CQSI 
connection is unique per JDBC url used to initiate the connection. Once 
created, CQSI connections are cached per 24 hr (by default) for every unique 
JDBC url provided.

When client initiates CQSI connection, the connection initialization also 
attempts to execute some metadata queries to ensure that the system tables like 
SYSTEM.CATALOG exist and the client version is compatible against the server 
version. For this, CQSI#init makes RPC calls against MetaDataEndpointImpl 
coproc.

This operation is valid for every CQSI connection initiated for every unique 
JDBC url by every client. However, when server hosting SYSTEM.CATALOG initiates 
CQSI connection, it means that SYSTEM.CATALOG and other system tables already 
exist. Moreover, client/server version compatibility check is not required 
because the connection is being created from the same server that is hosting 
SYSTEM.CATALOG.


> CQSI connection init from regionserver hosting SYSTEM.CATALOG does not 
> require RPC calls to system tables
> -
>
> Key: PHOENIX-7375
> URL: https://issues.apache.org/jira/browse/PHOENIX-7375
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> In order to execute any query on the server side, the Phoenix client needs to 
> create a CQSI (ConnectionQueryServicesImpl) connection, which internally 
> initiates and maintains a long-lived connection to the HBase server. A CQSI 
> connection is unique per JDBC url used to initiate the connection. Once 
> created, CQSI connections are cached for 24 hr (by default) for every unique 
> JDBC url provided.
> When a client initiates a CQSI connection, the connection initialization also 
> executes some metadata queries to ensure that system tables like 
> SYSTEM.CATALOG exist and that the client version is compatible with the 
> server version. For this, CQSI#init makes RPC calls against the 
> MetaDataEndpointImpl coproc.
> This is valid for every CQSI connection initiated for every unique JDBC url 
> by every client. However, when the server hosting SYSTEM.CATALOG initiates a 
> CQSI connection, SYSTEM.CATALOG and the other system tables must already 
> exist. Moreover, the client/server version compatibility check is not 
> required because the connection is being created from the same server that 
> hosts SYSTEM.CATALOG.
> Metadata operations performed by the regionserver hosting SYSTEM.CATALOG 
> region(s) also hold a row-level write lock for the given PTable entry. Hence, 
> this improvement is also expected to bring some perf improvement for metadata 
> operations.
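The short-circuit described above boils down to one condition during CQSI init: the system-table existence and version compatibility RPCs are redundant exactly when the connection is created on the regionserver hosting SYSTEM.CATALOG. Method and parameter names below are illustrative, not Phoenix's actual API.

```java
// Sketch of the proposed short-circuit in CQSI initialization. A client-side
// connection (and a server-side connection on any other host) still verifies
// system tables and version compatibility; only the host of SYSTEM.CATALOG
// can safely skip both checks, since the table's presence on that server
// already proves existence and version alignment.
public class CqsiInit {
    static boolean needsMetadataRpc(boolean serverSideConnection, boolean hostsSystemCatalog) {
        return !(serverSideConnection && hostsSystemCatalog);
    }
}
```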





[jira] [Updated] (PHOENIX-7375) CQSI connection init from regionserver hosting SYSTEM.CATALOG does not require RPC calls to system tables

2024-08-06 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7375:
--
Summary: CQSI connection init from regionserver hosting SYSTEM.CATALOG does 
not require RPC calls to system tables  (was: CQSI connection initialization 
from SYSTEM.CATALOG regionserver does not require RPC calls to system tables)

> CQSI connection init from regionserver hosting SYSTEM.CATALOG does not 
> require RPC calls to system tables
> -
>
> Key: PHOENIX-7375
> URL: https://issues.apache.org/jira/browse/PHOENIX-7375
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> In order to execute any query on the server side, the Phoenix client needs to 
> create a CQSI (ConnectionQueryServicesImpl) connection, which internally 
> initiates and maintains a long-lived connection to the HBase server. A CQSI 
> connection is unique per JDBC url used to initiate the connection. Once 
> created, CQSI connections are cached for 24 hr (by default) for every unique 
> JDBC url provided.
> When a client initiates a CQSI connection, the connection initialization also 
> executes some metadata queries to ensure that system tables like 
> SYSTEM.CATALOG exist and that the client version is compatible with the 
> server version. For this, CQSI#init makes RPC calls against the 
> MetaDataEndpointImpl coproc.
> This is valid for every CQSI connection initiated for every unique JDBC url 
> by every client. However, when the server hosting SYSTEM.CATALOG initiates a 
> CQSI connection, SYSTEM.CATALOG and the other system tables must already 
> exist. Moreover, the client/server version compatibility check is not 
> required because the connection is being created from the same server that 
> hosts SYSTEM.CATALOG.





[jira] [Updated] (PHOENIX-7330) Introducing Binary JSON (BSON) with Complex Document structures in Phoenix

2024-08-03 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7330:
--
Fix Version/s: 5.3.0

> Introducing Binary JSON (BSON) with Complex Document structures in Phoenix
> --
>
> Key: PHOENIX-7330
> URL: https://issues.apache.org/jira/browse/PHOENIX-7330
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.3.0
>
>
> The purpose of this Jira is to introduce a new data type in Phoenix: Binary 
> JSON (BSON), to manage more complex document data structures in Phoenix.
> BSON, or Binary JSON, is a binary-encoded serialization of JSON-like 
> documents. The BSON data type lets users store, update and query part or all 
> of a BsonDocument in the most performant way, without having to 
> serialize/deserialize the whole document to/from binary format. BSON allows 
> deserializing only part of a nested document, so that querying or indexing 
> any attribute within the nested structure becomes more efficient and 
> performant, as the deserialization happens at runtime. Any other document 
> structure would require deserializing the binary into the full document and 
> then performing the query.
> BSON spec: [https://bsonspec.org/]
> JSON and BSON are closely related by design. BSON serves as a binary 
> representation of JSON data, tailored with specialized extensions for wider 
> application scenarios, and finely tuned for efficient data storage and 
> traversal. Similar to JSON, BSON facilitates the embedding of objects and 
> arrays.
> One particular way in which BSON differs from JSON is its support for more 
> advanced data types. For instance, JSON does not differentiate between 
> integers (round numbers) and floating-point numbers (with decimal 
> precision). BSON does distinguish between the two and stores them in the 
> corresponding BSON data type (e.g. BsonInt32 vs BsonDouble). Many server-side 
> programming languages offer advanced numeric data types (standards include 
> integer, regular-precision floating point i.e. “float”, 
> double-precision floating point i.e. “double”, and boolean values), each with 
> its own optimal usage for efficient mathematical operations.
> Another key distinction between BSON and JSON is that BSON documents can 
> include Date or Binary objects, which cannot be directly represented in pure 
> JSON format. BSON also provides the ability to store and retrieve 
> user-defined Binary objects. Likewise, by integrating advanced data 
> structures like Sets into BSON documents, we can significantly enhance the 
> capabilities of Phoenix for storing, retrieving, and updating Binary, Sets, 
> Lists, and Documents as nested or complex data types.
> Moreover, the JSON format is both human and machine readable, whereas the 
> BSON format is only machine readable. Hence, as part of introducing the BSON 
> data type, we also need to provide a user interface so that users can provide 
> human-readable JSON as input for the BSON data type.
> This Jira also introduces access and update functions for BSON documents.
> BSON_CONDITION_EXPRESSION can evaluate a condition expression on document 
> fields, similar to how a WHERE clause evaluates a condition expression on 
> columns of the given row(s) for relational tables.
> BSON_UPDATE_EXPRESSION can perform one or more document field updates similar 
> to how UPSERT statements can perform update to one or more columns of the 
> given row(s) for the relational tables.
>  
> Phoenix can introduce more complex data structures like sets of scalar types, 
> in addition to the nested documents and nested arrays provided by BSON.
> Overall, by combining various functionalities available in Phoenix like 
> secondary indexes, conditional updates, high throughput read/write with BSON, 
> we can evolve Phoenix into highly scalable Document Database.
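
A minimal sketch of the int-vs-double distinction described above. This is
purely illustrative: it is not Phoenix code and not the real BSON wire format
(see https://bsonspec.org/), only a toy type-tagged binary encoding showing
why a tag per value lets a decoder recover a distinction JSON cannot express.

```python
import struct

# Toy BSON-like encoder: each value carries a one-byte type tag, so the
# decoder can tell an int32 from a double without a schema. Tag values 0x01
# (double) and 0x10 (int32) are borrowed from the BSON spec; everything else
# here is a simplification for illustration.
TYPE_DOUBLE, TYPE_INT32 = 0x01, 0x10

def encode(name: str, value) -> bytes:
    key = name.encode() + b"\x00"          # NUL-terminated field name
    if isinstance(value, bool):
        raise TypeError("bools are not handled in this sketch")
    if isinstance(value, int):
        return bytes([TYPE_INT32]) + key + struct.pack("<i", value)
    if isinstance(value, float):
        return bytes([TYPE_DOUBLE]) + key + struct.pack("<d", value)
    raise TypeError(f"unsupported type: {type(value)}")

def decode_one(buf: bytes):
    tag, rest = buf[0], buf[1:]
    name, _, payload = rest.partition(b"\x00")
    if tag == TYPE_INT32:
        return name.decode(), struct.unpack("<i", payload[:4])[0]
    if tag == TYPE_DOUBLE:
        return name.decode(), struct.unpack("<d", payload[:8])[0]
    raise ValueError(f"unknown tag: {tag:#x}")

# The round trip preserves the integer/floating-point distinction:
print(decode_one(encode("count", 42)))    # ('count', 42)
print(decode_one(encode("price", 42.5)))  # ('price', 42.5)
```

The same principle is what lets a real BSON decoder walk to one tagged field
and deserialize only that field, rather than the whole document.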



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Release Note: 
PHOENIX-6066 introduces a way for us to take an HBase read-level row lock, by 
default, while retrieving the PTable object as part of the getTable() RPC call. 
Before PHOENIX-6066, only a write-level row lock was used, which hurts 
performance even when the server-side metadata cache has the latest data and no 
lookup from the SYSTEM.CATALOG table is required.

PHOENIX-7363 protects the server-side metadata cache update with a Phoenix 
write-level row lock. As part of the getTable() call, we must already be 
holding the HBase read-level row lock. Hence, PHOENIX-7363 provides protection 
for server-side metadata cache updates.

PHOENIX-6066 and PHOENIX-7363 must be combined.

  was:
PHOENIX-6066 introduces a way for us to take an HBase read-level row lock, by 
default, while retrieving the PTable object as part of the getTable() RPC call. 
Before PHOENIX-6066, only a write-level row lock was used.

PHOENIX-7363 protects the server-side metadata cache update with a Phoenix 
write-level row lock. As part of the getTable() call, we must already be 
holding the HBase read-level row lock. Hence, PHOENIX-7363 provides protection 
for server-side metadata cache updates.

PHOENIX-6066 and PHOENIX-7363 must be combined.


> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing readLock for the getTable API at the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of any 
> of the MetaDataEndpointImpl coproc operations. When the table details are not 
> present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
> new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
> load, several (or all) meta handlers – which are dedicated to all metadata 
> (system table) operations – could attempt to create a server-side connection, 
> which can further lead to creating a new PhoenixConnection that executes 
> CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
> exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apach
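The read-lock-for-cache-hits / write-lock-for-cache-updates split described in
the release note above can be sketched with a classic readers-writer cache.
This is purely illustrative Python, not Phoenix internals: `RWCache` and its
methods are hypothetical names, standing in for the idea that many concurrent
readers may serve the PTable from cache while an update takes an exclusive lock.

```python
import threading

class RWCache:
    """Toy readers-writer cache: shared access for reads, exclusive for updates."""
    def __init__(self):
        self._cache = {}
        self._readers = 0
        self._readers_lock = threading.Lock()  # guards the reader count
        self._write_lock = threading.Lock()    # exclusive lock for updates

    def get(self, key, loader):
        # Shared (read) section: the first reader blocks writers, additional
        # readers pass through concurrently.
        with self._readers_lock:
            self._readers += 1
            if self._readers == 1:
                self._write_lock.acquire()
        try:
            if key in self._cache:
                return self._cache[key]
        finally:
            with self._readers_lock:
                self._readers -= 1
                if self._readers == 0:
                    self._write_lock.release()
        # Cache miss: take the exclusive (write) lock to load and update.
        with self._write_lock:
            if key not in self._cache:           # re-check under the lock
                self._cache[key] = loader(key)
            return self._cache[key]

cache = RWCache()
print(cache.get("T1", lambda k: f"ptable:{k}"))   # miss: loads under write lock
print(cache.get("T1", lambda k: "never called"))  # hit: served under read lock
```

The analogy to the release note: the hot path (a cache hit) never contends on
the exclusive lock, so read-heavy getTable() traffic stays fast.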

[jira] [Updated] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Release Note: 
PHOENIX-6066 introduces a way for us to take an HBase read-level row lock, by 
default, while retrieving the PTable object as part of the getTable() RPC call. 
Before PHOENIX-6066, only a write-level row lock was used.

PHOENIX-7363 protects the server-side metadata cache update with a Phoenix 
write-level row lock. As part of the getTable() call, we must already be 
holding the HBase read-level row lock. Hence, PHOENIX-7363 provides protection 
for server-side metadata cache updates.

PHOENIX-6066 and PHOENIX-7363 must be combined.

  was:PHOENIX-6066 and PHOENIX-7363 must be combined.


> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing readLock for the getTable API at the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of any 
> of the MetaDataEndpointImpl coproc operations. When the table details are not 
> present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
> new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
> load, several (or all) meta handlers – which are dedicated to all metadata 
> (system table) operations – could attempt to create a server-side connection, 
> which can further lead to creating a new PhoenixConnection that executes 
> CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
> exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.

[jira] [Resolved] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7363.
---
Resolution: Fixed

> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing readLock for the getTable API at the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of any 
> of the MetaDataEndpointImpl coproc operations. When the table details are not 
> present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
> new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
> load, several (or all) meta handlers – which are dedicated to all metadata 
> (system table) operations – could attempt to create a server-side connection, 
> which can further lead to creating a new PhoenixConnection that executes 
> CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
> exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> 

[jira] [Updated] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Release Note: PHOENIX-6066 and PHOENIX-7363 must be combined.  (was: 
PHOENIX-6066 and PHOENIX-7363 should be combined.)

> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing readLock for the getTable API at the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of any 
> of the MetaDataEndpointImpl coproc operations. When the table details are not 
> present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
> new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
> load, several (or all) meta handlers – which are dedicated to all metadata 
> (system table) operations – could attempt to create a server-side connection, 
> which can further lead to creating a new PhoenixConnection that executes 
> CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
> exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.ipc.RpcServer

[jira] [Updated] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-28 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Release Note: PHOENIX-6066 and PHOENIX-7363 should be combined.

> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing readLock for the getTable API at the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of any 
> of the MetaDataEndpointImpl coproc operations. When the table details are not 
> present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
> new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
> load, several (or all) meta handlers – which are dedicated to all metadata 
> (system table) operations – could attempt to create a server-side connection, 
> which can further lead to creating a new PhoenixConnection that executes 
> CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
> exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
>     at org.apache.hadoop.hbas

[jira] [Updated] (PHOENIX-7370) Server to server system table RPC calls should use separate RPC handler pool

2024-07-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7370:
--
Priority: Critical  (was: Major)

> Server to server system table RPC calls should use separate RPC handler pool
> 
>
> Key: PHOENIX-7370
> URL: https://issues.apache.org/jira/browse/PHOENIX-7370
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Critical
>
> HBase uses an RPC (Remote Procedure Call) framework for all the wire 
> communication among its components, e.g. client to server (client to master 
> daemon or client to regionservers) as well as server to server (master to 
> regionserver, regionserver to regionserver) communication. HBase RPC uses 
> Google's Protocol Buffers (protobuf) for defining the structure of messages 
> sent between clients and servers. Protocol Buffers allow efficient 
> serialization and deserialization of data, which is crucial for performance. 
> HBase defines service interfaces using Protocol Buffers, which outline the 
> operations that clients can request from HBase servers. These interfaces 
> define methods like get, put, scan, etc., that clients use to interact with 
> the database.
> HBase also provides Coprocessors. HBase Coprocessors are used to extend the 
> regionservers' functionality. They allow custom code to execute within the 
> context of the regionserver during specific phases of a given workflow, 
> such as during data reads (preScan, postScan etc.), writes (preBatchMutate, 
> postBatchMutate etc.), region splits, or even at the start or end of 
> regionserver operations. In addition to being a SQL query engine, Phoenix is 
> also a Coprocessor component. The RPC framework using Protobuf defines how 
> coprocessor endpoints communicate between clients and the coprocessors 
> running on the regionservers.
> The Phoenix client creates a CQSI connection ({{{}ConnectionQueryServices{}}}), 
> which maintains a long-lived TCP connection with the HBase server, usually 
> known as {{HConnection}} or HBase Connection. Once the connection is created, 
> it is cached by the Phoenix client.
> While PHOENIX-6066 is considered the correct fix to improve query 
> performance, releasing it surfaced other issues related to the RPC framework. 
> One of the surfaced issues caused a deadlock for the regionserver serving 
> SYSTEM.CATALOG, as it could not make any more progress because all handler 
> threads serving RPC calls for Phoenix system tables (thread pool: 
> {{{}RpcServer.Metadata.Fifo.handler{}}}) got exhausted while creating a 
> server-side connection from the given regionserver.
> Several workflows from the MetaDataEndpointImpl coproc require a Phoenix 
> connection, which is usually a CQSI connection. Phoenix differentiates CQSI 
> connections initiated by clients and servers by using a property: 
> {{{}IS_SERVER_CONNECTION{}}}.
> For CQSI connections created by servers, IS_SERVER_CONNECTION is set to true.
> Under heavy load, when several clients execute getTable() calls for the same 
> base table simultaneously, the MetaDataEndpointImpl coproc initially attempts 
> to create a server-side CQSI connection. As CQSI initialization also depends 
> on the Phoenix system tables existence check as well as client-to-server 
> version compatibility checks, it also performs a 
> MetaDataEndpointImpl#getVersion() RPC call, which is meant to be served by 
> the RpcServer.Metadata.Fifo.handler thread pool. However, under heavy load, 
> the thread pool can be completely occupied if all getTable() calls try to 
> initiate a CQSI connection, whereas only a single thread can take the global 
> CQSI lock to initiate the HBase Connection before the CQSI connection is 
> cached for other threads to use. This has the potential to create a deadlock.
> h3. Solutions:
>  * Phoenix server-to-server system table RPC calls are supposed to use 
> separate handler thread pools (PHOENIX-6687). However, this does not work 
> correctly because, regardless of whether the HBase Connection is initiated by 
> a client or a server, Phoenix only provides ClientRpcControllerFactory by 
> default. We need to provide a separate RpcControllerFactory during HBase 
> Connection initialization done by Coprocessors that operate on regionservers.
>  * For a Phoenix server creating a CQSI connection, we do not need to check 
> for the existence of system tables or client-server version compatibility. 
> This redundant RPC call can be avoided.
>  
> Doc on the HBase/Phoenix RPC Scheduler Framework: 
> https://docs.google.com/document/d/12SzcAY3mJVsN0naMnq45qsHcUIk1CzHsAI0EOi6IIgg/edit?usp=sharing
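
The pool-exhaustion scenario above can be sketched with two thread pools. This
is illustrative Python, not Phoenix or HBase code: `get_table` stands in for a
metadata handler that must itself issue a nested RPC (like getVersion()) and
block on the result. If the nested call were queued on the same fully-occupied
pool, no worker would ever be free to run it; dispatching nested calls to a
separate pool (the fix the issue proposes) breaks the cycle.

```python
from concurrent.futures import ThreadPoolExecutor

# Two handler pools, analogous to separate RPC handler pools for
# client-to-server vs server-to-server system table calls.
client_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="client-rpc")
server_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="server-rpc")

def nested_rpc(table: str) -> str:
    # Stand-in for a server-side call such as MetaDataEndpointImpl#getVersion().
    return f"version-of-{table}"

def get_table(table: str) -> str:
    # The handler blocks until the nested call completes. Submitting the
    # nested call to client_pool while every client_pool worker is blocked
    # right here would deadlock; server_pool keeps a worker available.
    future = server_pool.submit(nested_rpc, table)
    return future.result()

# Saturate the client pool: every worker blocks on a nested call, yet all
# requests complete because the nested calls run on the separate pool.
results = list(client_pool.map(get_table, ["SYSTEM.CATALOG", "T1", "T2", "T3"]))
print(results)
client_pool.shutdown()
server_pool.shutdown()
```

With a single shared two-worker pool, the same workload would hang as soon as
both workers blocked waiting on nested calls that could never be scheduled.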



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PHOENIX-7376) ViewUtil#findAllDescendantViews should provide two versions to differentiate CQSI initiated by clients and servers

2024-07-27 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7376:
-

 Summary: ViewUtil#findAllDescendantViews should provide two 
versions to differentiate CQSI initiated by clients and servers
 Key: PHOENIX-7376
 URL: https://issues.apache.org/jira/browse/PHOENIX-7376
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


ViewUtil#findAllDescendantViews provides the ability to retrieve all the 
descendant views of a given table or view by scanning the parent-to-child 
hierarchy in depth-first fashion. Since this utility was initially built for 
coprocessor endpoints, it creates the CQSI connection with the server 
connection property IS_SERVER_CONNECTION.

While we don't yet have server-connection-specific logic, we need to provide 
separate RPC handler pools for server-to-server RPC calls for System tables 
(PHOENIX-7370). In order to properly differentiate connections created from 
client to server vs. server to server, ViewUtil#findAllDescendantViews() 
should provide two flavors: one to be used by client-to-server connections 
and another by server-to-server connections.

For instance, PHOENIX-7067 and PHOENIX-4555 should use the client version of 
ViewUtil#findAllDescendantViews(), whereas all MetaDataEndpointImpl workflows 
should use the server version of ViewUtil#findAllDescendantViews().
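The proposed split can be sketched as two entry points that differ only in which kind of CQSI connection they use. The method names mirror ViewUtil, but the bodies are illustrative stand-ins, not the real Phoenix implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch of the proposed split: a client flavor and a server flavor of
// findAllDescendantViews. Names and bodies are illustrative, not the real
// Phoenix code.
class ViewUtilSketch {
    enum ConnectionKind { CLIENT, SERVER }

    // Client flavor: for client-to-server callers (e.g. PHOENIX-7067, PHOENIX-4555).
    static List<String> findAllDescendantViews(Map<String, List<String>> children,
                                               String table) {
        return walk(ConnectionKind.CLIENT, children, table);
    }

    // Server flavor: for coprocessor endpoints such as MetaDataEndpointImpl,
    // which would obtain a CQSI connection tagged with IS_SERVER_CONNECTION.
    static List<String> findAllDescendantViewsOnServer(Map<String, List<String>> children,
                                                       String table) {
        return walk(ConnectionKind.SERVER, children, table);
    }

    // Depth-first walk of the parent-to-child hierarchy, as described above.
    private static List<String> walk(ConnectionKind kind,
                                     Map<String, List<String>> children,
                                     String table) {
        List<String> out = new ArrayList<>();
        for (String child : children.getOrDefault(table, Collections.emptyList())) {
            out.add(child);
            out.addAll(walk(kind, children, child));
        }
        return out;
    }
}
```

Only the connection plumbing differs between the two flavors; the traversal itself is shared.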





[jira] [Assigned] (PHOENIX-7376) ViewUtil#findAllDescendantViews should provide two versions to differentiate CQSI initiated by clients and servers

2024-07-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7376:
-

Assignee: Viraj Jasani

> ViewUtil#findAllDescendantViews should provide two versions to differentiate 
> CQSI initiated by clients and servers
> --
>
> Key: PHOENIX-7376
> URL: https://issues.apache.org/jira/browse/PHOENIX-7376
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> ViewUtil#findAllDescendantViews provides the ability to retrieve all the 
> descendant views of a given table or view by scanning the parent-to-child 
> hierarchy in depth-first fashion. Since this utility was initially built 
> for coprocessor endpoints, it creates the CQSI connection with the server 
> connection property IS_SERVER_CONNECTION.
> While we don't yet have server-connection-specific logic, we need to provide 
> separate RPC handler pools for server-to-server RPC calls for System tables 
> (PHOENIX-7370). In order to properly differentiate connections created from 
> client to server vs. server to server, ViewUtil#findAllDescendantViews() 
> should provide two flavors: one to be used by client-to-server connections 
> and another by server-to-server connections.
> For instance, PHOENIX-7067 and PHOENIX-4555 should use the client version of 
> ViewUtil#findAllDescendantViews(), whereas all MetaDataEndpointImpl workflows 
> should use the server version of ViewUtil#findAllDescendantViews().





[jira] [Assigned] (PHOENIX-7375) CQSI connection initialization from SYSTEM.CATALOG regionserver does not require RPC calls to system tables

2024-07-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7375:
-

Assignee: Viraj Jasani

> CQSI connection initialization from SYSTEM.CATALOG regionserver does not 
> require RPC calls to system tables
> ---
>
> Key: PHOENIX-7375
> URL: https://issues.apache.org/jira/browse/PHOENIX-7375
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> In order to execute any query at the server side, the Phoenix client must 
> create a CQSI (ConnectionQueryServicesImpl) connection, which internally 
> initiates and maintains a long-lasting connection to the HBase server. A 
> CQSI connection is unique per JDBC url used to initiate the connection. Once 
> created, CQSI connections are cached for 24 hours (by default) for every 
> unique JDBC url provided.
> When a client initiates a CQSI connection, the connection initialization 
> also executes some metadata queries to ensure that the system tables like 
> SYSTEM.CATALOG exist and that the client version is compatible with the 
> server version. For this, CQSI#init makes RPC calls against the 
> MetaDataEndpointImpl coproc.
> This operation is valid for every CQSI connection initiated for every unique 
> JDBC url by every client. However, when the server hosting SYSTEM.CATALOG 
> initiates a CQSI connection, SYSTEM.CATALOG and the other system tables 
> already exist. Moreover, the client/server version compatibility check is 
> not required because the connection is being created from the same server 
> that hosts SYSTEM.CATALOG.





[jira] [Created] (PHOENIX-7375) CQSI connection initialization from SYSTEM.CATALOG regionserver does not require RPC calls to system tables

2024-07-27 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7375:
-

 Summary: CQSI connection initialization from SYSTEM.CATALOG 
regionserver does not require RPC calls to system tables
 Key: PHOENIX-7375
 URL: https://issues.apache.org/jira/browse/PHOENIX-7375
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


In order to execute any query at the server side, the Phoenix client must 
create a CQSI (ConnectionQueryServicesImpl) connection, which internally 
initiates and maintains a long-lasting connection to the HBase server. A CQSI 
connection is unique per JDBC url used to initiate the connection. Once 
created, CQSI connections are cached for 24 hours (by default) for every 
unique JDBC url provided.

When a client initiates a CQSI connection, the connection initialization also 
executes some metadata queries to ensure that the system tables like 
SYSTEM.CATALOG exist and that the client version is compatible with the 
server version. For this, CQSI#init makes RPC calls against the 
MetaDataEndpointImpl coproc.

This operation is valid for every CQSI connection initiated for every unique 
JDBC url by every client. However, when the server hosting SYSTEM.CATALOG 
initiates a CQSI connection, SYSTEM.CATALOG and the other system tables 
already exist. Moreover, the client/server version compatibility check is not 
required because the connection is being created from the same server that 
hosts SYSTEM.CATALOG.
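The optimization above can be sketched as a simple guard in the init path. Method names are illustrative; in Phoenix the real logic would live in ConnectionQueryServicesImpl#init:

```java
// Sketch of the optimization: when the CQSI connection is initiated from the
// regionserver that hosts SYSTEM.CATALOG, the existence and version checks
// can be skipped entirely. Names are illustrative, not Phoenix internals.
class CqsiInitSketch {
    int rpcCalls = 0; // counts init-time RPCs issued to MetaDataEndpointImpl

    void init(boolean initiatedFromSystemCatalogServer) {
        if (initiatedFromSystemCatalogServer) {
            // SYSTEM.CATALOG is hosted here, so the system tables must already
            // exist, and "client" and server are the same build: nothing to verify.
            return;
        }
        checkSystemTablesExist();    // metadata query against SYSTEM.CATALOG
        checkVersionCompatibility(); // MetaDataEndpointImpl#getVersion() RPC
    }

    private void checkSystemTablesExist() { rpcCalls++; }
    private void checkVersionCompatibility() { rpcCalls++; }
}
```

A client-initiated init still performs both checks; only the server-local path skips them.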





[jira] [Resolved] (PHOENIX-7369) Avoid redundant recursive getTable() RPC calls

2024-07-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7369.
---
Resolution: Fixed

> Avoid redundant recursive getTable() RPC calls
> --
>
> Key: PHOENIX-7369
> URL: https://issues.apache.org/jira/browse/PHOENIX-7369
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> When the PTable object is built for any given table as part of the 
> getTable() API, many recursive getTable() calls are made in order to build 
> the full PTable object with all details of the parent table/view hierarchy 
> as well as index tables.
> Moreover, PHOENIX-6247 introduced a way to separate the Phoenix logical 
> table name from the HBase physical table name. As part of this, the number 
> of recursive getTable() calls has increased, but more importantly, in the 
> case of a view index, the number of getTable() RPC calls using the 
> server-side CQSI connection has also increased. These are redundant, as the 
> view index table does not have a corresponding PTable representation in 
> Phoenix.
> {code:java}
>             } else if (Bytes.compareTo(LINK_TYPE_BYTES, 0, 
> LINK_TYPE_BYTES.length, colKv.getQualifierArray(), 
> colKv.getQualifierOffset(), colKv.getQualifierLength()) == 0) {
>                 LinkType linkType = 
> LinkType.fromSerializedValue(colKv.getValueArray()[colKv.getValueOffset()]);
>                 if (linkType == LinkType.INDEX_TABLE) {
>                     addIndexToTable(tenantId, schemaName, famName, tableName, 
> clientTimeStamp, indexes, clientVersion);
>                 } else if (linkType == PHYSICAL_TABLE) {
>                     // famName contains the logical name of the parent table. 
> We need to get the actual physical name of the table
>                     PTable parentTable = null;
>                     if (indexType != IndexType.LOCAL) {
>                         parentTable = getTable(null, 
> SchemaUtil.getSchemaNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>                                 
> SchemaUtil.getTableNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>  clientTimeStamp, clientVersion);
>                         if (parentTable == null || 
> isTableDeleted(parentTable)) {
>                             // parentTable is not in the cache. Since famName 
> is only logical name, we need to find the physical table.
>                             try (PhoenixConnection connection = 
> QueryUtil.getConnectionOnServer(env.getConfiguration()).unwrap(PhoenixConnection.class))
>  {
>                                 parentTable = 
> connection.getTableNoCache(famName.getString());
>                             } catch (TableNotFoundException e) {
>                                 // It is ok to swallow this exception since 
> this could be a view index and _IDX_ table is not there.
>                             }
>                         }
>                     } {code}
>  
> Under heavy load, the situation can get worse and occupy all metadata handler 
> threads, freezing the regionserver hosting SYSTEM.CATALOG.
> The proposal for this Jira:
>  * For the View Index table, do not perform any getTable() call. This will 
> also avoid a large number of RPC calls.
>  * Only for splittable SYSTEM.CATALOG should we allow getTable() RPC calls, 
> if scanning the local region yields a null PTable.
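The two proposal bullets reduce to a small predicate deciding whether a getTable() RPC should be issued at all. This is a sketch with illustrative names; the real logic belongs in MetaDataEndpointImpl:

```java
// Sketch of the proposal: never issue getTable() for a view index link, and
// fall back to a getTable() RPC only when SYSTEM.CATALOG is splittable and
// the local-region scan did not yield a PTable. Names are illustrative.
class GetTableGuardSketch {
    static boolean shouldIssueGetTableRpc(boolean isViewIndexLink,
                                          boolean systemCatalogSplittable,
                                          boolean localScanFoundTable) {
        if (isViewIndexLink) {
            // View index tables have no PTable representation in Phoenix;
            // the RPC would be redundant.
            return false;
        }
        // For non-splittable SYSTEM.CATALOG, the local region scan is
        // authoritative, so no remote call is needed either way.
        return systemCatalogSplittable && !localScanFoundTable;
    }
}
```

Under this guard, the only remaining RPC is the genuinely necessary one: a splittable catalog whose local region did not contain the parent row.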





[jira] [Updated] (PHOENIX-7370) Server to server system table RPC calls should use separate RPC handler pool

2024-07-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7370:
--
Description: 
HBase uses RPC (Remote Procedure Call) framework for all the wire communication 
among its components e.g. client to server (client to master daemon or client 
to regionservers) as well as server to server (master to regionserver, 
regionserver to regionserver) communication. HBase RPC uses Google's Protocol 
Buffers (protobuf) for defining the structure of messages sent between clients 
and servers. Protocol Buffers allow efficient serialization and deserialization 
of data, which is crucial for performance. HBase defines service interfaces 
using Protocol Buffers, which outline the operations that clients can request 
from HBase servers. These interfaces define methods like get, put, scan, etc., 
that clients use to interact with the database.

HBase also provides Coprocessors. HBase Coprocessors are used to extend 
regionserver functionality. They allow custom code to execute within the 
context of the regionserver during specific phases of a given workflow, such 
as during data reads (preScan, postScan, etc.), writes (preBatchMutate, 
postBatchMutate, etc.), region splits, or even at the start or end of 
regionserver operations. In addition to being a SQL query engine, Phoenix is 
also a Coprocessor component. The RPC framework, using Protobuf, defines how 
coprocessor endpoints communicate between clients and the coprocessors 
running on the regionservers.

The Phoenix client creates a CQSI connection ({{{}ConnectionQueryServices{}}}), 
which maintains a long-lived TCP connection with the HBase server, usually 
known as the {{HConnection}} or HBase Connection. Once the connection is 
created, it is cached by the Phoenix client.
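The caching behavior can be sketched with a per-url map. The 24-hour expiry is omitted, and all names are illustrative stand-ins, not the actual Phoenix internals:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of CQSI caching: one connection per unique JDBC url, created once
// and then reused by later callers. Expiry omitted; names are illustrative.
class CqsiCacheSketch {
    static final AtomicInteger initCount = new AtomicInteger();

    // Stand-in for a cached ConnectionQueryServices instance.
    static class Cqsi {
        final String jdbcUrl;
        Cqsi(String jdbcUrl) { this.jdbcUrl = jdbcUrl; }
    }

    static final ConcurrentHashMap<String, Cqsi> CACHE = new ConcurrentHashMap<>();

    static Cqsi getConnection(String jdbcUrl) {
        // computeIfAbsent guarantees a single initialization per url even
        // under concurrent callers; everyone else reuses the cached instance.
        return CACHE.computeIfAbsent(jdbcUrl, url -> {
            initCount.incrementAndGet(); // expensive HBase Connection setup would happen here
            return new Cqsi(url);
        });
    }
}
```

This single-initializer property is exactly why only one thread holds the global CQSI lock while every other caller waits, which matters for the deadlock described below.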

While PHOENIX-6066 is considered the correct fix to improve query 
performance, releasing it has surfaced other issues related to the RPC 
framework. One of the issues surfaced caused a deadlock for the regionserver 
serving SYSTEM.CATALOG, as it could not make any more progress because all 
handler threads serving RPC calls for Phoenix system tables (thread pool: 
{{{}RpcServer.Metadata.Fifo.handler{}}}) got exhausted while creating the 
server side connection from the given regionserver.
Several workflows in the MetaDataEndpointImpl coproc require a Phoenix 
connection, which is usually a CQSI connection. Phoenix differentiates CQSI 
connections initiated by clients and servers by using a property: 
{{{}IS_SERVER_CONNECTION{}}}.
For CQSI connections created by servers, IS_SERVER_CONNECTION is kept true.
Under heavy load, when several clients execute getTable() calls for the same 
base table simultaneously, the MetaDataEndpointImpl coproc first attempts to 
create a server-side CQSI connection. Because CQSI initialization depends on 
the Phoenix system tables existence check as well as client-to-server version 
compatibility checks, it also performs a MetaDataEndpointImpl#getVersion() 
RPC call, which is meant to be served by the RpcServer.Metadata.Fifo.handler 
thread-pool. However, under heavy load, the thread-pool can be completely 
occupied if all getTable() calls try to initiate the CQSI connection, whereas 
only a single thread can take the global CQSI lock to initiate the HBase 
Connection before caching the CQSI connection for other threads to use. This 
has the potential to create a deadlock.
h3. Solutions:
 * Phoenix server-to-server system table RPC calls are supposed to use 
separate handler thread-pools (PHOENIX-6687). However, this is not working 
correctly because, regardless of whether the HBase Connection is initiated by 
a client or a server, Phoenix only provides ClientRpcControllerFactory by 
default. We need to provide a separate RpcControllerFactory during HBase 
Connection initialization done by Coprocessors that operate on regionservers.
 * When a Phoenix server creates a CQSI connection, we do not need to check 
for the existence of system tables or client-server version compatibility. 
This redundant RPC call can be avoided.

 

Doc on HBase/Phoenix RPC Scheduler Framework: 
https://docs.google.com/document/d/12SzcAY3mJVsN0naMnq45qsHcUIk1CzHsAI0EOi6IIgg/edit?usp=sharing

  was:
HBase uses RPC (Remote Procedure Call) framework for all the wire communication 
among its components e.g. client to server (client to master daemon or client 
to regionservers) as well as server to server (master to regionserver, 
regionserver to regionserver) communication. HBase RPC uses Google's Protocol 
Buffers (protobuf) for defining the structure of messages sent between clients 
and servers. Protocol Buffers allow efficient serialization and deserialization 
of data, which is crucial for performance. HBase defines service interfaces 
using Protocol Buffers, which outline the operations that clients can request 
from HBase servers. These interfaces define methods like get, put, scan, etc., 
that clients use to inter

[jira] [Updated] (PHOENIX-7369) Avoid redundant recursive getTable() RPC calls

2024-07-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7369:
--
Priority: Blocker  (was: Major)

> Avoid redundant recursive getTable() RPC calls
> --
>
> Key: PHOENIX-7369
> URL: https://issues.apache.org/jira/browse/PHOENIX-7369
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
>
> When the PTable object is built for any given table as part of the 
> getTable() API, many recursive getTable() calls are made in order to build 
> the full PTable object with all details of the parent table/view hierarchy 
> as well as index tables.
> Moreover, PHOENIX-6247 introduced a way to separate the Phoenix logical 
> table name from the HBase physical table name. As part of this, the number 
> of recursive getTable() calls has increased, but more importantly, in the 
> case of a view index, the number of getTable() RPC calls using the 
> server-side CQSI connection has also increased. These are redundant, as the 
> view index table does not have a corresponding PTable representation in 
> Phoenix.
> {code:java}
>             } else if (Bytes.compareTo(LINK_TYPE_BYTES, 0, 
> LINK_TYPE_BYTES.length, colKv.getQualifierArray(), 
> colKv.getQualifierOffset(), colKv.getQualifierLength()) == 0) {
>                 LinkType linkType = 
> LinkType.fromSerializedValue(colKv.getValueArray()[colKv.getValueOffset()]);
>                 if (linkType == LinkType.INDEX_TABLE) {
>                     addIndexToTable(tenantId, schemaName, famName, tableName, 
> clientTimeStamp, indexes, clientVersion);
>                 } else if (linkType == PHYSICAL_TABLE) {
>                     // famName contains the logical name of the parent table. 
> We need to get the actual physical name of the table
>                     PTable parentTable = null;
>                     if (indexType != IndexType.LOCAL) {
>                         parentTable = getTable(null, 
> SchemaUtil.getSchemaNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>                                 
> SchemaUtil.getTableNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>  clientTimeStamp, clientVersion);
>                         if (parentTable == null || 
> isTableDeleted(parentTable)) {
>                             // parentTable is not in the cache. Since famName 
> is only logical name, we need to find the physical table.
>                             try (PhoenixConnection connection = 
> QueryUtil.getConnectionOnServer(env.getConfiguration()).unwrap(PhoenixConnection.class))
>  {
>                                 parentTable = 
> connection.getTableNoCache(famName.getString());
>                             } catch (TableNotFoundException e) {
>                                 // It is ok to swallow this exception since 
> this could be a view index and _IDX_ table is not there.
>                             }
>                         }
>                     } {code}
>  
> Under heavy load, the situation can get worse and occupy all metadata handler 
> threads, freezing the regionserver hosting SYSTEM.CATALOG.
> The proposal for this Jira:
>  * For the View Index table, do not perform any getTable() call. This will 
> also avoid a large number of RPC calls.
>  * Only for splittable SYSTEM.CATALOG should we allow getTable() RPC calls, 
> if scanning the local region yields a null PTable.





[jira] [Updated] (PHOENIX-7369) Avoid redundant recursive getTable() RPC calls

2024-07-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7369:
--
Fix Version/s: 5.2.1
   5.3.0

> Avoid redundant recursive getTable() RPC calls
> --
>
> Key: PHOENIX-7369
> URL: https://issues.apache.org/jira/browse/PHOENIX-7369
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> When the PTable object is built for any given table as part of the 
> getTable() API, many recursive getTable() calls are made in order to build 
> the full PTable object with all details of the parent table/view hierarchy 
> as well as index tables.
> Moreover, PHOENIX-6247 introduced a way to separate the Phoenix logical 
> table name from the HBase physical table name. As part of this, the number 
> of recursive getTable() calls has increased, but more importantly, in the 
> case of a view index, the number of getTable() RPC calls using the 
> server-side CQSI connection has also increased. These are redundant, as the 
> view index table does not have a corresponding PTable representation in 
> Phoenix.
> {code:java}
>             } else if (Bytes.compareTo(LINK_TYPE_BYTES, 0, 
> LINK_TYPE_BYTES.length, colKv.getQualifierArray(), 
> colKv.getQualifierOffset(), colKv.getQualifierLength()) == 0) {
>                 LinkType linkType = 
> LinkType.fromSerializedValue(colKv.getValueArray()[colKv.getValueOffset()]);
>                 if (linkType == LinkType.INDEX_TABLE) {
>                     addIndexToTable(tenantId, schemaName, famName, tableName, 
> clientTimeStamp, indexes, clientVersion);
>                 } else if (linkType == PHYSICAL_TABLE) {
>                     // famName contains the logical name of the parent table. 
> We need to get the actual physical name of the table
>                     PTable parentTable = null;
>                     if (indexType != IndexType.LOCAL) {
>                         parentTable = getTable(null, 
> SchemaUtil.getSchemaNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>                                 
> SchemaUtil.getTableNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>  clientTimeStamp, clientVersion);
>                         if (parentTable == null || 
> isTableDeleted(parentTable)) {
>                             // parentTable is not in the cache. Since famName 
> is only logical name, we need to find the physical table.
>                             try (PhoenixConnection connection = 
> QueryUtil.getConnectionOnServer(env.getConfiguration()).unwrap(PhoenixConnection.class))
>  {
>                                 parentTable = 
> connection.getTableNoCache(famName.getString());
>                             } catch (TableNotFoundException e) {
>                                 // It is ok to swallow this exception since 
> this could be a view index and _IDX_ table is not there.
>                             }
>                         }
>                     } {code}
>  
> Under heavy load, the situation can get worse and occupy all metadata handler 
> threads, freezing the regionserver hosting SYSTEM.CATALOG.
> The proposal for this Jira:
>  * For the View Index table, do not perform any getTable() call. This will 
> also avoid a large number of RPC calls.
>  * Only for splittable SYSTEM.CATALOG should we allow getTable() RPC calls, 
> if scanning the local region yields a null PTable.





[jira] [Created] (PHOENIX-7370) Server to server system table RPC calls should use separate RPC handler pool

2024-07-24 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7370:
-

 Summary: Server to server system table RPC calls should use 
separate RPC handler pool
 Key: PHOENIX-7370
 URL: https://issues.apache.org/jira/browse/PHOENIX-7370
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 5.2.0
Reporter: Viraj Jasani


HBase uses RPC (Remote Procedure Call) framework for all the wire communication 
among its components e.g. client to server (client to master daemon or client 
to regionservers) as well as server to server (master to regionserver, 
regionserver to regionserver) communication. HBase RPC uses Google's Protocol 
Buffers (protobuf) for defining the structure of messages sent between clients 
and servers. Protocol Buffers allow efficient serialization and deserialization 
of data, which is crucial for performance. HBase defines service interfaces 
using Protocol Buffers, which outline the operations that clients can request 
from HBase servers. These interfaces define methods like get, put, scan, etc., 
that clients use to interact with the database.

HBase also provides Coprocessors. HBase Coprocessors are used to extend 
regionserver functionality. They allow custom code to execute within the 
context of the regionserver during specific phases of a given workflow, such 
as during data reads (preScan, postScan, etc.), writes (preBatchMutate, 
postBatchMutate, etc.), region splits, or even at the start or end of 
regionserver operations. In addition to being a SQL query engine, Phoenix is 
also a Coprocessor component. The RPC framework, using Protobuf, defines how 
coprocessor endpoints communicate between clients and the coprocessors 
running on the regionservers.

The Phoenix client creates a CQSI connection ({{{}ConnectionQueryServices{}}}), 
which maintains a long-lived TCP connection with the HBase server, usually 
known as the {{HConnection}} or HBase Connection. Once the connection is 
created, it is cached by the Phoenix client.

While PHOENIX-6066 is considered the correct fix to improve query 
performance, releasing it has surfaced other issues related to the RPC 
framework. One of the issues surfaced caused a deadlock for the regionserver 
serving SYSTEM.CATALOG, as it could not make any more progress because all 
handler threads serving RPC calls for Phoenix system tables (thread pool: 
{{{}RpcServer.Metadata.Fifo.handler{}}}) got exhausted while creating the 
server side connection from the given regionserver.
Several workflows in the MetaDataEndpointImpl coproc require a Phoenix 
connection, which is usually a CQSI connection. Phoenix differentiates CQSI 
connections initiated by clients and servers by using a property: 
{{{}IS_SERVER_CONNECTION{}}}.
For CQSI connections created by servers, IS_SERVER_CONNECTION is kept true.
Under heavy load, when several clients execute getTable() calls for the same 
base table simultaneously, the MetaDataEndpointImpl coproc first attempts to 
create a server-side CQSI connection. Because CQSI initialization depends on 
the Phoenix system tables existence check as well as client-to-server version 
compatibility checks, it also performs a MetaDataEndpointImpl#getVersion() 
RPC call, which is meant to be served by the RpcServer.Metadata.Fifo.handler 
thread-pool. However, under heavy load, the thread-pool can be completely 
occupied if all getTable() calls try to initiate the CQSI connection, whereas 
only a single thread can take the global CQSI lock to initiate the HBase 
Connection before caching the CQSI connection for other threads to use. This 
has the potential to create a deadlock.
h3. Solutions:
 * Phoenix server-to-server system table RPC calls are supposed to use 
separate handler thread-pools (PHOENIX-6687). However, this is not working 
correctly because, regardless of whether the HBase Connection is initiated by 
a client or a server, Phoenix only provides ClientRpcControllerFactory by 
default. We need to provide a separate RpcControllerFactory during HBase 
Connection initialization done by Coprocessors that operate on regionservers.
 * When a Phoenix server creates a CQSI connection, we do not need to check 
for the existence of system tables or client-server version compatibility. 
This redundant RPC call can be avoided.





[jira] [Assigned] (PHOENIX-7370) Server to server system table RPC calls should use separate RPC handler pool

2024-07-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7370:
-

Assignee: Viraj Jasani

> Server to server system table RPC calls should use separate RPC handler pool
> 
>
> Key: PHOENIX-7370
> URL: https://issues.apache.org/jira/browse/PHOENIX-7370
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> HBase uses RPC (Remote Procedure Call) framework for all the wire 
> communication among its components e.g. client to server (client to master 
> daemon or client to regionservers) as well as server to server (master to 
> regionserver, regionserver to regionserver) communication. HBase RPC uses 
> Google's Protocol Buffers (protobuf) for defining the structure of messages 
> sent between clients and servers. Protocol Buffers allow efficient 
> serialization and deserialization of data, which is crucial for performance. 
> HBase defines service interfaces using Protocol Buffers, which outline the 
> operations that clients can request from HBase servers. These interfaces 
> define methods like get, put, scan, etc., that clients use to interact with 
> the database.
> HBase also provides Coprocessors. HBase Coprocessors are used to extend 
> regionserver functionality. They allow custom code to execute within the 
> context of the regionserver during specific phases of a given workflow, 
> such as during data reads (preScan, postScan, etc.), writes (preBatchMutate, 
> postBatchMutate, etc.), region splits, or even at the start or end of 
> regionserver operations. In addition to being a SQL query engine, Phoenix 
> is also a Coprocessor component. The RPC framework, using Protobuf, defines 
> how coprocessor endpoints communicate between clients and the coprocessors 
> running on the regionservers.
> The Phoenix client creates a CQSI connection 
> ({{{}ConnectionQueryServices{}}}), which maintains a long-lived TCP 
> connection with the HBase server, usually known as the {{HConnection}} or 
> HBase Connection. Once the connection is created, it is cached by the 
> Phoenix client.
> While PHOENIX-6066 is considered the correct fix to improve query 
> performance, releasing it has surfaced other issues related to the RPC 
> framework. One of the issues surfaced caused a deadlock for the regionserver 
> serving SYSTEM.CATALOG, as it could not make any more progress because all 
> handler threads serving RPC calls for Phoenix system tables (thread pool: 
> {{{}RpcServer.Metadata.Fifo.handler{}}}) got exhausted while creating the 
> server side connection from the given regionserver.
> Several workflows in the MetaDataEndpointImpl coproc require a Phoenix 
> connection, which is usually a CQSI connection. Phoenix differentiates CQSI 
> connections initiated by clients and servers by using a property: 
> {{{}IS_SERVER_CONNECTION{}}}.
> For CQSI connections created by servers, IS_SERVER_CONNECTION is kept true.
> Under heavy load, when several clients execute getTable() calls for the same 
> base table simultaneously, the MetaDataEndpointImpl coproc first attempts to 
> create a server-side CQSI connection. Because CQSI initialization depends on 
> the Phoenix system tables existence check as well as client-to-server 
> version compatibility checks, it also performs a 
> MetaDataEndpointImpl#getVersion() RPC call, which is meant to be served by 
> the RpcServer.Metadata.Fifo.handler thread-pool. However, under heavy load, 
> the thread-pool can be completely occupied if all getTable() calls try to 
> initiate the CQSI connection, whereas only a single thread can take the 
> global CQSI lock to initiate the HBase Connection before caching the CQSI 
> connection for other threads to use. This has the potential to create a 
> deadlock.
> h3. Solutions:
>  * Phoenix server-to-server system table RPC calls are supposed to use 
> separate handler thread-pools (PHOENIX-6687). However, this is not working 
> correctly because, regardless of whether the HBase Connection is initiated 
> by a client or a server, Phoenix only provides ClientRpcControllerFactory 
> by default. We need to provide a separate RpcControllerFactory during HBase 
> Connection initialization done by Coprocessors that operate on regionservers.
>  * When a Phoenix server creates a CQSI connection, we do not need to check 
> for the existence of system tables or client-server version compatibility. 
> This redundant RPC call can be avoided.





[jira] [Assigned] (PHOENIX-7369) Avoid redundant recursive getTable() RPC calls

2024-07-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7369:
-

Assignee: Viraj Jasani

> Avoid redundant recursive getTable() RPC calls
> --
>
> Key: PHOENIX-7369
> URL: https://issues.apache.org/jira/browse/PHOENIX-7369
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> When the PTable object is built for any given table as part of the getTable() 
> API, many recursive getTable() calls are made in order to build the full 
> PTable object with all details of the parent table/view hierarchy as well as 
> index tables.
> Moreover, PHOENIX-6247 introduced a way to separate the Phoenix logical table 
> name from the HBase physical table name. As part of this, the number of 
> recursive getTable() calls has increased, but more importantly, in the case 
> of a view index, the number of getTable() RPC calls using the server-side 
> CQSI connection has also increased. These are redundant, as the view index 
> table does not have a corresponding PTable representation in Phoenix.
> {code:java}
>             } else if (Bytes.compareTo(LINK_TYPE_BYTES, 0, 
> LINK_TYPE_BYTES.length, colKv.getQualifierArray(), 
> colKv.getQualifierOffset(), colKv.getQualifierLength()) == 0) {
>                 LinkType linkType = 
> LinkType.fromSerializedValue(colKv.getValueArray()[colKv.getValueOffset()]);
>                 if (linkType == LinkType.INDEX_TABLE) {
>                     addIndexToTable(tenantId, schemaName, famName, tableName, 
> clientTimeStamp, indexes, clientVersion);
>                 } else if (linkType == PHYSICAL_TABLE) {
>                     // famName contains the logical name of the parent table. 
> We need to get the actual physical name of the table
>                     PTable parentTable = null;
>                     if (indexType != IndexType.LOCAL) {
>                         parentTable = getTable(null, 
> SchemaUtil.getSchemaNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>                                 
> SchemaUtil.getTableNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
>  clientTimeStamp, clientVersion);
>                         if (parentTable == null || 
> isTableDeleted(parentTable)) {
>                             // parentTable is not in the cache. Since famName 
> is only logical name, we need to find the physical table.
>                             try (PhoenixConnection connection = 
> QueryUtil.getConnectionOnServer(env.getConfiguration()).unwrap(PhoenixConnection.class))
>  {
>                                 parentTable = 
> connection.getTableNoCache(famName.getString());
>                             } catch (TableNotFoundException e) {
>                                 // It is ok to swallow this exception since 
> this could be a view index and _IDX_ table is not there.
>                             }
>                         }
>                     } {code}
>  
> Under heavy load, the situation can get worse and occupy all metadata handler 
> threads, freezing the regionserver hosting SYSTEM.CATALOG.
> The proposal for this Jira:
>  * For the view index table, do not perform any getTable() call. This will 
> also avoid a large number of RPC calls.
>  * Only for splittable SYSTEM.CATALOG should we allow getTable() RPC calls, 
> when scanning the local region yields a null PTable.





[jira] [Created] (PHOENIX-7369) Avoid redundant recursive getTable() RPC calls

2024-07-24 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7369:
-

 Summary: Avoid redundant recursive getTable() RPC calls
 Key: PHOENIX-7369
 URL: https://issues.apache.org/jira/browse/PHOENIX-7369
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 5.2.0
Reporter: Viraj Jasani


When the PTable object is built for any given table as part of the getTable() 
API, many recursive getTable() calls are made in order to build the full PTable 
object with all details of the parent table/view hierarchy as well as index 
tables.

Moreover, PHOENIX-6247 introduced a way to separate the Phoenix logical table 
name from the HBase physical table name. As part of this, the number of 
recursive getTable() calls has increased, but more importantly, in the case of a 
view index, the number of getTable() RPC calls using the server-side CQSI 
connection has also increased. These are redundant, as the view index table does 
not have a corresponding PTable representation in Phoenix.
{code:java}
            } else if (Bytes.compareTo(LINK_TYPE_BYTES, 0, 
LINK_TYPE_BYTES.length, colKv.getQualifierArray(), colKv.getQualifierOffset(), 
colKv.getQualifierLength()) == 0) {
                LinkType linkType = 
LinkType.fromSerializedValue(colKv.getValueArray()[colKv.getValueOffset()]);
                if (linkType == LinkType.INDEX_TABLE) {
                    addIndexToTable(tenantId, schemaName, famName, tableName, 
clientTimeStamp, indexes, clientVersion);
                } else if (linkType == PHYSICAL_TABLE) {
                    // famName contains the logical name of the parent table. 
We need to get the actual physical name of the table
                    PTable parentTable = null;
                    if (indexType != IndexType.LOCAL) {
                        parentTable = getTable(null, 
SchemaUtil.getSchemaNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
                                
SchemaUtil.getTableNameFromFullName(famName.getBytes()).getBytes(StandardCharsets.UTF_8),
 clientTimeStamp, clientVersion);
                        if (parentTable == null || isTableDeleted(parentTable)) 
{
                            // parentTable is not in the cache. Since famName 
is only logical name, we need to find the physical table.
                            try (PhoenixConnection connection = 
QueryUtil.getConnectionOnServer(env.getConfiguration()).unwrap(PhoenixConnection.class))
 {
                                parentTable = 
connection.getTableNoCache(famName.getString());
                            } catch (TableNotFoundException e) {
                                // It is ok to swallow this exception since 
this could be a view index and _IDX_ table is not there.
                            }
                        }
                    } {code}
 

Under heavy load, the situation can get worse and occupy all metadata handler 
threads, freezing the regionserver hosting SYSTEM.CATALOG.

The proposal for this Jira:
 * For the view index table, do not perform any getTable() call. This will also 
avoid a large number of RPC calls.
 * Only for splittable SYSTEM.CATALOG should we allow getTable() RPC calls, when 
scanning the local region yields a null PTable.
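The first proposal item amounts to a guard before the parent lookup. The sketch below is illustrative only: the `_IDX_` marker matches the view index naming mentioned in the code comments above, but the helper name and the exact check are assumptions, not the actual patch.

```java
// Illustrative guard: when following a PHYSICAL_TABLE link, skip the parent
// getTable() lookup entirely if the target is a view index table, since it
// has no PTable representation in Phoenix. Names here are hypothetical.
public class ParentLookup {
    static final String VIEW_INDEX_PREFIX = "_IDX_";

    static boolean shouldLookUpParent(String physicalName) {
        // View index physical names carry the _IDX_ marker; for them a
        // parent getTable() RPC is redundant and can be skipped.
        return !physicalName.contains(VIEW_INDEX_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(shouldLookUpParent("S._IDX_T")); // false: skip RPC
        System.out.println(shouldLookUpParent("S.T"));      // true: look up
    }
}
```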





[jira] [Updated] (PHOENIX-7363) Protect server side metadata cache updates for the given PTable

2024-07-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Summary: Protect server side metadata cache updates for the given PTable  
(was: Protect metadata cache updates for the given PTable)

> Protect server side metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing a readLock for the getTable API on the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of 
> any of the MetaDataEndpointImpl coproc operations. When the table details are 
> not present in the cache, the MetaDataEndpointImpl coproc can attempt to 
> create a new connection on the server side in order to scan SYSTEM.CATALOG. 
> Under heavy load, several (or all) meta handlers – which are dedicated to all 
> metadata (system table) operations – could attempt to create a server-side 
> connection, which can further lead to creating a new PhoenixConnection that 
> executes the CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the 
> system tables exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.

[jira] [Updated] (PHOENIX-7363) Protect metadata cache updates for the given PTable

2024-07-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Summary: Protect metadata cache updates for the given PTable  (was: All 
meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init)

> Protect metadata cache updates for the given PTable
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> *All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*
> After allowing a readLock for the getTable API on the server side 
> (PHOENIX-6066), it seems that under heavy load, all meta handler threads can 
> get exhausted within ConnectionQueryServicesImpl initialization as part of 
> any of the MetaDataEndpointImpl coproc operations. When the table details are 
> not present in the cache, the MetaDataEndpointImpl coproc can attempt to 
> create a new connection on the server side in order to scan SYSTEM.CATALOG. 
> Under heavy load, several (or all) meta handlers – which are dedicated to all 
> metadata (system table) operations – could attempt to create a server-side 
> connection, which can further lead to creating a new PhoenixConnection that 
> executes the CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the 
> system tables exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.ipc.RpcServe

[jira] [Updated] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Description: 
*All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init*

After allowing a readLock for the getTable API on the server side 
(PHOENIX-6066), it seems that under heavy load, all meta handler threads can get 
exhausted within ConnectionQueryServicesImpl initialization as part of any of 
the MetaDataEndpointImpl coproc operations. When the table details are not 
present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
load, several (or all) meta handlers – which are dedicated to all metadata 
(system table) operations – could attempt to create a server-side connection, 
which can further lead to creating a new PhoenixConnection that executes the 
CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
exist.
{code:java}
"RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
[0x7fd4b1234000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
    - waiting to lock <0x00047da00058> (a 
org.apache.phoenix.query.ConnectionQueryServicesImpl)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
    at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
    at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
    at 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
    at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) {code}
{code:java}
"RpcServer.Metadata.Fifo.handler=142,queue=12,port=60020" #815 daemon prio=5 
os_prio=0 tid=0x7fd53f07f000 nid=0x403 waiting on condition 
[0x7fd4b8234000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00047d400640> (a 
java.util.concurrent.FutureTask)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Fut

[jira] [Updated] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Description: 
After allowing a readLock for the getTable API on the server side 
(PHOENIX-6066), it seems that under heavy load, all meta handler threads can get 
exhausted within ConnectionQueryServicesImpl initialization as part of any of 
the MetaDataEndpointImpl coproc operations. When the table details are not 
present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
new connection on the server side in order to scan SYSTEM.CATALOG. Under heavy 
load, several (or all) meta handlers – which are dedicated to all metadata 
(system table) operations – could attempt to create a server-side connection, 
which can further lead to creating a new PhoenixConnection that executes the 
CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system tables 
exist.
{code:java}
"RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
[0x7fd4b1234000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
    - waiting to lock <0x00047da00058> (a 
org.apache.phoenix.query.ConnectionQueryServicesImpl)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
    at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
    at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
    at 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
    at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) {code}
{code:java}
"RpcServer.Metadata.Fifo.handler=142,queue=12,port=60020" #815 daemon prio=5 
os_prio=0 tid=0x7fd53f07f000 nid=0x403 waiting on condition 
[0x7fd4b8234000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00047d400640> (a 
java.util.concurrent.FutureTask)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
    at java.util.concurrent.FutureTask.g
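The contention shown in the thread dumps above can be modeled as a memoized initialization: one thread runs the expensive init while every other handler thread blocks waiting for the cached result, and the system deadlocks if the init itself needs RPCs served by those same blocked handlers. This plain-Java sketch is illustrative only and is not Phoenix's actual CQSI caching code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Illustrative model of cache-once initialization: computeIfAbsent runs the
// expensive init at most once per key, while concurrent callers for the same
// key block until the mapping is established (mirroring the "waiting to lock"
// handler threads in the dump above).
public class CachedInit {
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    Object getOrInit(String key, Supplier<Object> expensiveInit) {
        return cache.computeIfAbsent(key, k -> expensiveInit.get());
    }

    public static void main(String[] args) {
        CachedInit c = new CachedInit();
        Object first = c.getOrInit("cluster", Object::new);
        Object second = c.getOrInit("cluster", Object::new);
        System.out.println(first == second); // cached: init ran only once
    }
}
```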

[jira] [Updated] (PHOENIX-7365) ExplainPlanV2 should get trimmed list for regionserver location

2024-07-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7365:
--
Fix Version/s: 5.1.4

> ExplainPlanV2 should get trimmed list for regionserver location
> ---
>
> Key: PHOENIX-7365
> URL: https://issues.apache.org/jira/browse/PHOENIX-7365
> Project: Phoenix
>  Issue Type: Task
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> PHOENIX-6907 introduces a way for ExplainPlan to provide the list of 
> regionserver locations that are accessed by the query during execution. It 
> also introduces the config "phoenix.max.region.locations.size.explain.plan" 
> to restrict the max region locations displayed as part of the explain plan 
> output. However, the trimmed list was intentionally meant only for the query 
> output version of ExplainPlan, not for the ExplainPlanV2 based explain plan 
> output.
> That said, given the convenience provided by ExplainPlanV2 objects, it is far 
> easier for client applications to use them than to parse the old String-based 
> explain plan. It would be better to keep the trimmed list of regionservers 
> for ExplainPlanV2 as well, rather than keeping the full list around in the 
> objects.
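The trimming behavior described above can be sketched as a simple cap on the location list. Illustrative only: the class and method names are hypothetical; only the config name comes from the description.

```java
import java.util.List;

// Sketch of capping the regionserver-location list surfaced by the explain
// plan at a configured maximum ("phoenix.max.region.locations.size.explain.plan").
// Names are illustrative, not the actual Phoenix implementation.
public class RegionLocationTrimmer {
    static <T> List<T> trim(List<T> locations, int maxSize) {
        // Keep at most maxSize entries; the full list is never retained.
        return locations.size() <= maxSize
                ? locations
                : List.copyOf(locations.subList(0, maxSize));
    }

    public static void main(String[] args) {
        List<String> locs = List.of("rs1:60020", "rs2:60020", "rs3:60020");
        System.out.println(trim(locs, 2)); // [rs1:60020, rs2:60020]
    }
}
```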





[jira] [Resolved] (PHOENIX-7365) ExplainPlanV2 should get trimmed list for regionserver location

2024-07-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7365.
---
Resolution: Fixed

> ExplainPlanV2 should get trimmed list for regionserver location
> ---
>
> Key: PHOENIX-7365
> URL: https://issues.apache.org/jira/browse/PHOENIX-7365
> Project: Phoenix
>  Issue Type: Task
>Affects Versions: 5.2.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0, 5.1.4
>
>
> PHOENIX-6907 introduces a way for ExplainPlan to provide the list of 
> regionserver locations that are accessed by the query during execution. It 
> also introduces the config "phoenix.max.region.locations.size.explain.plan" 
> to restrict the max region locations displayed as part of the explain plan 
> output. However, the trimmed list was intentionally meant only for the query 
> output version of ExplainPlan, not for the ExplainPlanV2 based explain plan 
> output.
> That said, given the convenience provided by ExplainPlanV2 objects, it is far 
> easier for client applications to use them than to parse the old String-based 
> explain plan. It would be better to keep the trimmed list of regionservers 
> for ExplainPlanV2 as well, rather than keeping the full list around in the 
> objects.





[jira] [Resolved] (PHOENIX-7233) CQSI openConnection should timeout to unblock other connection threads

2024-07-18 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7233.
---
Resolution: Won't Fix

With HBASE-28428 resolved, we should no longer need PHOENIX-7233.

> CQSI openConnection should timeout to unblock other connection threads
> --
>
> Key: PHOENIX-7233
> URL: https://issues.apache.org/jira/browse/PHOENIX-7233
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.1.3
>Reporter: Viraj Jasani
>Priority: Major
>
> PhoenixDriver initializes and caches ConnectionQueryServices objects with 
> connectionQueryServicesCache. As part of CQSI initialization, a connection is 
> opened to the HBase server using the ConnectionFactory provided by the HBase 
> client, which returns a Connection object to the client. The Connection 
> object provided by HBase allows clients to share the Zookeeper connection and 
> meta cache, as well as remote connections to regionservers and master 
> daemons. The Connection object is used to perform Table CRUD operations as 
> well as administrative actions on the cluster.
> HBase Connection object initialization requires the ClusterId, which is 
> maintained either in Zookeeper or the Master daemons (or both) and retrieved 
> by the client depending on whether the client is configured to use 
> ZKConnectionRegistry or MasterRegistry/RpcConnectionRegistry.
> For ZKConnectionRegistry, we have run into an edge case wherein the 
> connection to the Zookeeper server got stuck for more than 12 hours. When the 
> client tried to create a connection to the Zookeeper quorum to retrieve the 
> ClusterId, the Zookeeper leader was switched from one server to another. 
> While the leader-switch event resulting in a stuck connection requires RCA, 
> it is not appropriate for the Phoenix/HBase client to wait indefinitely for a 
> response from Zookeeper without any connection timeout.
> For the Phoenix client, if one thread is stuck opening a connection during 
> CQSI#init, all other threads trying to create connections get stuck as well, 
> because we take a class-level lock before opening the connection, leading to 
> all threads getting stuck and potential termination or degradation of the 
> client JVM.
> While the HBase client should also use a timeout, not having a timeout on the 
> Phoenix client side has far worse complications. As part of this Jira, we 
> should introduce a way for CQSI#openConnection to time out, either by using 
> the CompletableFuture API or using our preconfigured thread-pool.
>  
> Stacktrace for reference:
>  
> {code:java}
> jdk.internal.misc.Unsafe.park
> java.util.concurrent.locks.LockSupport.park
> java.util.concurrent.CompletableFuture$Signaller.block
> java.util.concurrent.ForkJoinPool.managedBlock
> java.util.concurrent.CompletableFuture.waitingGet
> java.util.concurrent.CompletableFuture.get
> org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId
> org.apache.hadoop.hbase.client.ConnectionImplementation.<init>
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance?
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance
> java.lang.reflect.Constructor.newInstance
> org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$?
> org.apache.hadoop.hbase.client.ConnectionFactory$$Lambda$?.run
> java.security.AccessController.doPrivileged
> javax.security.auth.Subject.doAs
> org.apache.hadoop.security.UserGroupInformation.doAs
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$?
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl$?.call
> org.apache.phoenix.util.PhoenixContextExecutor.call
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices
> org.apache.phoenix.jdbc.HighAvailabilityGroup.connectToOneCluster
> org.apache.phoenix.jdbc.ParallelPhoenixConnection.getConnection
> org.apache.phoenix.jdbc.ParallelPhoenixConnection.lambda$new$?
> org.apache.phoenix.jdbc.ParallelPhoenixConnection$$Lambda$?.get
> org.apache.phoenix.jdbc.ParallelPhoenixContext.lambda$chainOnConnClusterContext$?
> org.apache.phoenix.jdbc.ParallelPhoenixContext$$Lambda$?.apply {code}
>  
>  
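The bounded wait proposed above can be sketched with the CompletableFuture API. This is a hypothetical illustration under assumed names (openConnection, openWithTimeout), not Phoenix's actual implementation:

```java
import java.util.concurrent.*;

// Sketch: run the blocking connection setup on a separate executor and bound
// the wait with CompletableFuture.get(timeout), so a stuck Zookeeper/registry
// call no longer blocks CQSI#init (and every thread queued behind its lock).
public class OpenConnectionWithTimeout {

    // Stand-in for the blocking HBase connection setup.
    static String openConnection() throws InterruptedException {
        Thread.sleep(50); // simulate network work
        return "connection";
    }

    static String openWithTimeout(long timeoutMs, ExecutorService pool)
            throws Exception {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
            try {
                return openConnection();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new CompletionException(e);
            }
        }, pool);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort: frees the calling thread, not the stuck socket.
            future.cancel(true);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println(openWithTimeout(1000, pool));
        pool.shutdown();
    }
}
```

Note that cancelling the future does not unblock the underlying I/O; it only caps how long callers of CQSI#init wait.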



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7233) CQSI openConnection should timeout to unblock other connection threads

2024-07-18 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7233:
-

Assignee: (was: Divneet Kaur)

> CQSI openConnection should timeout to unblock other connection threads
> --
>
> Key: PHOENIX-7233
> URL: https://issues.apache.org/jira/browse/PHOENIX-7233
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.1.3
>Reporter: Viraj Jasani
>Priority: Major
>





[jira] [Created] (PHOENIX-7365) ExplainPlanV2 should get trimmed list for regionserver location

2024-07-18 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7365:
-

 Summary: ExplainPlanV2 should get trimmed list for regionserver 
location
 Key: PHOENIX-7365
 URL: https://issues.apache.org/jira/browse/PHOENIX-7365
 Project: Phoenix
  Issue Type: Task
Affects Versions: 5.2.0
Reporter: Viraj Jasani
Assignee: Viraj Jasani
 Fix For: 5.2.1, 5.3.0


PHOENIX-6907 introduced a way for ExplainPlan to provide the list of 
regionserver locations accessed by a query during execution. It also 
introduced the config "phoenix.max.region.locations.size.explain.plan" to cap 
the number of region locations displayed in the explain plan output. However, 
the trimmed list was intentionally applied only to the query-output version 
of ExplainPlan, not to the ExplainPlanV2 based explain plan output.

Given the convenience of ExplainPlanV2 objects, it is far easier for client 
applications to use them than to parse the old String-based explain plan. It 
would therefore be better to keep a trimmed list of regionservers for 
ExplainPlanV2 as well, rather than keeping the full list around in the objects.
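The trimming described above amounts to bounding a list by a configured maximum. A minimal sketch, with hypothetical names (RegionLocationTrimmer, trim) that are not Phoenix's actual API:

```java
import java.util.List;

// Sketch: cap a region-location list at the configured maximum, mirroring
// what "phoenix.max.region.locations.size.explain.plan" does for the String
// explain plan output.
public class RegionLocationTrimmer {

    // Returns at most maxSize locations, so callers see a bounded view
    // instead of the full (possibly huge) list held in the plan object.
    static List<String> trim(List<String> locations, int maxSize) {
        if (locations.size() <= maxSize) {
            return locations;
        }
        return locations.subList(0, maxSize);
    }

    public static void main(String[] args) {
        List<String> locs = List.of("rs1:60020", "rs2:60020", "rs3:60020");
        System.out.println(trim(locs, 2)); // [rs1:60020, rs2:60020]
    }
}
```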





[jira] [Resolved] (PHOENIX-6066) MetaDataEndpointImpl.doGetTable should acquire a readLock instead of an exclusive writeLock on the table header row

2024-07-17 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-6066.
---
Resolution: Fixed

> MetaDataEndpointImpl.doGetTable should acquire a readLock instead of an 
> exclusive writeLock on the table header row
> ---
>
> Key: PHOENIX-6066
> URL: https://issues.apache.org/jira/browse/PHOENIX-6066
> Project: Phoenix
>  Issue Type: Improvement
>Affects Versions: 5.0.0, 4.15.0
>Reporter: Chinmay Kulkarni
>Assignee: Palash Chauhan
>Priority: Major
>  Labels: quality-improvement
> Fix For: 5.2.1, 5.3.0
>
>
> Throughout MetaDataEndpointImpl, wherever we need to acquire a row lock we 
> call 
> [MetaDataEndpointImpl.acquireLock|https://github.com/apache/phoenix/blob/bba7d59f81f2b91342fa5a7ee213170739573d6a/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2377-L2386]
>  which gets an exclusive writeLock on the specified row [by 
> default|https://github.com/apache/phoenix/blob/bba7d59f81f2b91342fa5a7ee213170739573d6a/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2378].
> Thus, even operations like doGetTable/getSchema/getFunctions that do not 
> modify the row acquire a writeLock on these metadata rows when a readLock 
> would be sufficient (see [doGetTable 
> locking|https://github.com/apache/phoenix/blob/bba7d59f81f2b91342fa5a7ee213170739573d6a/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2932]
>  as an example). The problem is that even a simple UPSERT/DELETE or SELECT 
> query triggers a doGetTable (if the schema is not cached) and can 
> potentially block other DDLs and, more importantly, other queries, since 
> those queries wait until they can get a rowLock on the table header row. 
> Even a seemingly unrelated operation like CREATE VIEW AS SELECT * FROM T 
> can block a SELECT/UPSERT/DELETE on table T, since the create-view code 
> needs to fetch the schema of the parent table.
> Note that this is exacerbated in cases where we do server-server RPCs while 
> holding rowLocks for example 
> ([this|https://github.com/apache/phoenix/blob/1d844950bb4ec8221873ecd2b094c20f427cd984/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2459-L2461]
>  and 
> [this|https://github.com/apache/phoenix/blob/1d844950bb4ec8221873ecd2b094c20f427cd984/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2479-L2484])
>  which is another issue altogether.
> This Jira is to discuss the possibility of acquiring a readLock in these 
> "read metadata" paths to avoid blocking other "read metadata" requests 
> stemming from concurrent queries. The current behavior is potentially a perf 
> issue for clients that disable update-cache-frequency.
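The reader/writer distinction argued for above can be sketched with a ReentrantReadWriteLock. This is a minimal illustration of the locking semantics, not Phoenix's row-lock implementation; all names here are hypothetical:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: metadata reads take the shared read lock, so concurrent
// doGetTable-style lookups no longer serialize behind one another;
// writers still get exclusive access to the header row.
public class RowLockSketch {
    static final ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();

    static String readHeaderRow() {
        rowLock.readLock().lock();        // shared: many readers at once
        try {
            return "schema-of-table";
        } finally {
            rowLock.readLock().unlock();
        }
    }

    static void writeHeaderRow() {
        rowLock.writeLock().lock();       // exclusive: blocks readers/writers
        try {
            // mutate the table header row here
        } finally {
            rowLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(2);
        Runnable reader = () -> { readHeaderRow(); done.countDown(); };
        new Thread(reader).start();
        new Thread(reader).start();
        done.await();                     // both readers complete concurrently
        System.out.println("concurrent reads ok");
    }
}
```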





[jira] [Updated] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-17 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Priority: Blocker  (was: Major)

> All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 5.2.1, 5.3.0
>
>
> After allowing a readLock for the getTable API on the server side 
> (PHOENIX-6066), it appears that under heavy load all meta handler threads 
> can be exhausted during ConnectionQueryServicesImpl initialization as part 
> of any of the MetaDataEndpointImpl coproc operations. When the table 
> details are not present in the cache, the MetaDataEndpointImpl coproc can 
> attempt to create a new connection on the server side in order to scan 
> SYSTEM.CATALOG. Under heavy load, several (or all) meta handlers, which are 
> dedicated to all metadata (system table) operations, could attempt to 
> create a server-side connection, which can in turn create a new 
> PhoenixConnection to execute the CREATE TABLE DDL for SYSTEM.CATALOG in 
> order to ensure that the system tables exist.
> {code:java}
> "RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
> os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
> [0x7fd4b1234000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
>     - waiting to lock <0x00047da00058> (a 
> org.apache.phoenix.query.ConnectionQueryServicesImpl)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>     at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
>     at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
>     at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
>     at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>     at java.sql.DriverManager.getConnection(DriverManager.java:208)
>     at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
>     at 
> org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
>     at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
>     at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
>     at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
>     at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
>     at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) {code}
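One common pattern for avoiding the handler pile-up shown in this thread dump is to memoize the expensive initialization in a shared future, so the slow work runs once and waiting threads block on the future with a bounded timeout instead of on a class-level monitor. A hedged sketch, not the actual Phoenix fix (names like initOnce are hypothetical):

```java
import java.util.concurrent.*;

// Sketch: cache one CompletableFuture per cluster key, so only one thread
// performs the expensive init while later callers wait on the future with a
// timeout instead of exhausting the meta handler pool on a monitor.
public class SharedInitSketch {
    private final ConcurrentHashMap<String, CompletableFuture<String>> inits =
            new ConcurrentHashMap<>();

    String initOnce(String clusterKey, long timeoutMs) throws Exception {
        CompletableFuture<String> f = inits.computeIfAbsent(clusterKey,
                k -> CompletableFuture.supplyAsync(() -> "services-for-" + k));
        // Bounded wait: handlers give up after timeoutMs rather than
        // blocking indefinitely behind the initializing thread.
        return f.get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        SharedInitSketch sketch = new SharedInitSketch();
        System.out.println(sketch.initOnce("cluster-a", 1000));
        System.out.println(sketch.initOnce("cluster-a", 1000)); // cached
    }
}
```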

[jira] [Assigned] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7363:
-

Assignee: Viraj Jasani

> All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>

[jira] [Created] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-16 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7363:
-

 Summary: All meta handlers on SYSTEM.CATALOG regionserver 
exhausted within CQSI#init
 Key: PHOENIX-7363
 URL: https://issues.apache.org/jira/browse/PHOENIX-7363
 Project: Phoenix
  Issue Type: Bug
Reporter: Viraj Jasani


After allowing a readLock for the getTable API on the server side 
(PHOENIX-6066), it appears that under heavy load all meta handler threads can 
be exhausted during ConnectionQueryServicesImpl initialization as part of any 
of the MetaDataEndpointImpl coproc operations. When the table details are not 
present in the cache, the MetaDataEndpointImpl coproc can attempt to create a 
new connection on the server side in order to scan SYSTEM.CATALOG. Under 
heavy load, several (or all) meta handlers, which are dedicated to all 
metadata (system table) operations, could attempt to create a server-side 
connection, which can in turn create a new PhoenixConnection to execute the 
CREATE TABLE DDL for SYSTEM.CATALOG in order to ensure that the system 
tables exist.
{code:java}
"RpcServer.Metadata.Fifo.handler=254,queue=20,port=60020" #927 daemon prio=5 
os_prio=0 tid=0x7fd53f16a000 nid=0x473 waiting for monitor entry 
[0x7fd4b1234000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3547)
    - waiting to lock <0x00047da00058> (a 
org.apache.phoenix.query.ConnectionQueryServicesImpl)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:3537)
    at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:272)
    at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:229)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at org.apache.phoenix.util.QueryUtil.getConnection(QueryUtil.java:433)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:410)
    at 
org.apache.phoenix.util.QueryUtil.getConnectionOnServer(QueryUtil.java:391)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1499)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.addIndexToTable(MetaDataEndpointImpl.java:832)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1490)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTableFromCells(MetaDataEndpointImpl.java:1075)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1069)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:737)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:3599)
    at 
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:669)
    at 
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:19507)
    at 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7941)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2537)
    at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2511)
    at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45035)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) {code}
{code:java}
"RpcServer.Metadata.Fifo.handler=142,queue=12,port=60020" #815 daemon prio=5 
os_prio=0 tid=0x7fd53f07f000 nid=0x403 waiting on condition 
[0x7fd4b8234000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00047d400640> (a 
java.util.concurrent.FutureTask)
    at java.util.concurrent.

[jira] [Updated] (PHOENIX-7363) All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init

2024-07-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7363:
--
Fix Version/s: 5.2.1
   5.3.0

> All meta handlers on SYSTEM.CATALOG regionserver exhausted within CQSI#init
> ---
>
> Key: PHOENIX-7363
> URL: https://issues.apache.org/jira/browse/PHOENIX-7363
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>

[jira] [Updated] (PHOENIX-6978) Redesign Phoenix TTL for Views

2024-07-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-6978:
--
Fix Version/s: 5.3.0

> Redesign Phoenix TTL for Views
> --
>
> Key: PHOENIX-6978
> URL: https://issues.apache.org/jira/browse/PHOENIX-6978
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Jacob Isaac
>Assignee: Jacob Isaac
>Priority: Major
> Fix For: 5.3.0
>
>
> With Phoenix TTL for views (PHOENIX-3725), the basic gist was the TTL should 
> be a Phoenix view level setting instead of being at the table level as 
> implemented in HBase. More details on the old design are here ([Phoenix TTL 
> old design 
> doc|https://docs.google.com/document/d/1aZWhJQCARBVt9VIXNgINCB8O0fk2GucxXeu7472SVL8/edit#heading=h.kpf13qig3vdl]).
> Both HBase TTL and Phoenix TTL rely on applying expiration logic during the 
> scanning phase when serving query results and apply deletion logic when 
> pruning the rows from the store. In HBase, the pruning is achieved during the 
> compaction phase.
> The initial design and implementation of Phoenix TTL for views used the MR 
> framework to run delete jobs to prune away the expired rows. We knew this was 
> a sub-optimal solution, since it required managing and monitoring MR jobs. It 
> would also have introduced additional delete markers, which would have 
> temporarily added more rows (the delete markers themselves) and made scans 
> less performant.
> Using the HBase compaction framework instead to prune away the expired rows 
> would fit nicely into the existing architecture and would be as efficient as 
> pruning the HBase TTL rows. 
> This jira proposes a redesign of Phoenix TTL for Views using PHOENIX-6888 and 
> PHOENIX-4555
> [New Design 
> doc|https://docs.google.com/document/d/1D2B0G_sVe9eE66bk-sxUfSgoGtQCvD7xBZRxZz-Q1TM/edit]
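The shared expiration rule both systems apply at read time boils down to a single age check; a minimal sketch of the semantics (illustrative only, not the actual Phoenix or HBase code):

```java
public class TtlSketch {
    // Expiration check applied at scan time (the same rule HBase uses
    // for table-level TTL): a cell is expired once its age exceeds the
    // configured TTL. Expired cells are filtered from query results and
    // physically pruned during compaction.
    static boolean isExpired(long cellTimestampMs, long nowMs, long ttlMs) {
        return nowMs - cellTimestampMs > ttlMs;
    }

    public static void main(String[] args) {
        System.out.println(isExpired(0, 10_000, 5_000));     // true: 10s old, 5s TTL
        System.out.println(isExpired(8_000, 10_000, 5_000)); // false: only 2s old
    }
}
```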



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-6978) Redesign Phoenix TTL for Views

2024-07-16 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-6978.
---
Resolution: Fixed

> Redesign Phoenix TTL for Views
> --
>
> Key: PHOENIX-6978
> URL: https://issues.apache.org/jira/browse/PHOENIX-6978
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Jacob Isaac
>Assignee: Jacob Isaac
>Priority: Major
> Fix For: 5.3.0
>
>
> With Phoenix TTL for views (PHOENIX-3725), the basic gist was the TTL should 
> be a Phoenix view level setting instead of being at the table level as 
> implemented in HBase. More details on the old design are here ([Phoenix TTL 
> old design 
> doc|https://docs.google.com/document/d/1aZWhJQCARBVt9VIXNgINCB8O0fk2GucxXeu7472SVL8/edit#heading=h.kpf13qig3vdl]).
> Both HBase TTL and Phoenix TTL rely on applying expiration logic during the 
> scanning phase when serving query results and apply deletion logic when 
> pruning the rows from the store. In HBase, the pruning is achieved during the 
> compaction phase.
> The initial design and implementation of Phoenix TTL for views used the MR 
> framework to run delete jobs to prune away the expired rows. We knew this was 
> a sub-optimal solution, since it required managing and monitoring MR jobs. It 
> would also have introduced additional delete markers, which would have 
> temporarily added more rows (the delete markers themselves) and made scans 
> less performant.
> Using the HBase compaction framework instead to prune away the expired rows 
> would fit nicely into the existing architecture and would be as efficient as 
> pruning the HBase TTL rows. 
> This jira proposes a redesign of Phoenix TTL for Views using PHOENIX-6888 and 
> PHOENIX-4555
> [New Design 
> doc|https://docs.google.com/document/d/1D2B0G_sVe9eE66bk-sxUfSgoGtQCvD7xBZRxZz-Q1TM/edit]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (PHOENIX-7357) New variable length binary data type: VARBINARY_ENCODED

2024-07-11 Thread Viraj Jasani (Jira)
Viraj Jasani created PHOENIX-7357:
-

 Summary: New variable length binary data type: VARBINARY_ENCODED
 Key: PHOENIX-7357
 URL: https://issues.apache.org/jira/browse/PHOENIX-7357
 Project: Phoenix
  Issue Type: New Feature
Reporter: Viraj Jasani
Assignee: Viraj Jasani
 Fix For: 5.3.0


As of today, Phoenix provides several variable-length as well as fixed-length 
data types. One of the variable-length data types is VARBINARY, a 
variable-length binary blob. Using VARBINARY as the only primary key column is 
effectively the same as using the raw HBase row key.

HBase provides a single row key. Any client application that needs more than 
one column in its primary key must therefore take special care to pack all the 
column values into one binary row key when using HBase directly. Phoenix 
instead supports composite primary keys: a composite primary key can contain 
any number of primary key columns, and Phoenix also allows adding new nullable 
primary key columns to an existing composite primary key. Since Phoenix uses 
HBase as its backing store, it supports multiple primary key columns by 
internally concatenating the binary-encoded values of the primary key columns 
and using the concatenated binary value as the HBase row key. To concatenate 
and later retrieve the individual primary key values efficiently, Phoenix 
implements two schemes:
 # For fixed-length columns: the number of bytes used is determined by the 
column's maximum length. On the read path, while iterating through the row 
key, a fixed number of bytes is consumed for each column. On the write path, 
if the encoded value of the column is shorter, it is padded with null bytes 
(\x00) up to the fixed length. Hence, for smaller values, some space is 
wasted.
 # For variable-length columns: since the length of a variable-length value 
cannot be known in advance, a separator (terminator) byte is used. Phoenix 
uses the null byte (\x00) as the separator. VARCHAR is the most commonly used 
variable-length data type, and since VARCHAR represents a String, the null 
byte is never a valid character within the value. Hence, it reliably marks 
where the given VARCHAR value terminates.
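The two schemes can be sketched as follows. This is a simplified illustration of the idea only, not Phoenix's actual byte layout or encoding code, and the helper names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CompositeKeySketch {

    // Scheme 1 (fixed-length columns): shorter values are right-padded
    // with null bytes up to the column's maximum length.
    static byte[] padFixed(byte[] value, int fixedLen) {
        byte[] out = new byte[fixedLen]; // Java arrays are zero-filled: \x00 padding
        System.arraycopy(value, 0, out, 0, value.length);
        return out;
    }

    // Scheme 2 (variable-length columns): append each value followed by
    // a \x00 separator. Safe for VARCHAR because a valid string never
    // contains the null byte.
    static byte[] encodeVarchar(String... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String v : values) {
            byte[] b = v.getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
            out.write(0x00); // separator/terminator byte
        }
        return out.toByteArray();
    }

    // Decoding walks the row key and splits on each \x00 byte.
    static List<String> decodeVarchar(byte[] rowKey) {
        List<String> values = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < rowKey.length; i++) {
            if (rowKey[i] == 0x00) {
                values.add(new String(rowKey, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] key = encodeVarchar("tenant1", "user42");
        System.out.println(decodeVarchar(key)); // [tenant1, user42]
    }
}
```

The space/flexibility trade-off is visible here: the fixed scheme wastes padding bytes for short values, while the variable scheme only works when the separator byte can never occur inside a value.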

 

The null byte (\x00) works fine as a separator for VARCHAR. However, it cannot 
be used as a separator for VARBINARY, because VARBINARY values can contain 
arbitrary bytes, including \x00. Due to this, Phoenix places two restrictions 
on the VARBINARY type: 

 
 # It can only be used as the last part of the composite primary key.
 # It cannot be used as a DESC order primary key column.

 

Using the VARBINARY data type in an earlier portion of a composite primary key 
is a valid use case, as is using multiple VARBINARY primary key columns. After 
all, the whole point of Phoenix composite keys is to let users define multiple 
primary key columns.

Besides, using a secondary index on a data table means that the composite 
primary key of the secondary index table includes: 

  …  
  … 

 

As the data table's primary key columns are appended to the secondary index 
columns, one cannot create a secondary index on any VARBINARY column.

This Jira proposes introducing a new data type, {*}VARBINARY_ENCODED{*}, which 
is free of both restrictions: it can appear anywhere in a composite primary 
key and can be used as a DESC-ordered column.

This means we need a way to reliably determine where the variable-length 
binary data terminates, in the absence of fixed-length information.
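One common way to lift these restrictions is an invertible escape scheme, so that the separator byte can never appear unescaped inside an encoded value. The sketch below is purely illustrative and is not Phoenix's actual VARBINARY_ENCODED byte format:

```java
import java.io.ByteArrayOutputStream;

public class BinaryEscapeSketch {
    // Escape each 0x00 inside the payload as the pair 0x00 0xFF, so a
    // bare 0x00 (not followed by 0xFF) can again act as the field
    // terminator. Illustrative scheme only.
    static byte[] escape(byte[] value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : value) {
            out.write(b);
            if (b == 0x00) {
                out.write(0xFF); // mark this null byte as payload, not terminator
            }
        }
        return out.toByteArray();
    }

    static byte[] unescape(byte[] encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length; i++) {
            out.write(encoded[i]);
            if (encoded[i] == 0x00) {
                i++; // skip the 0xFF escape marker
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] blob = {1, 0, 2};    // contains an embedded null byte
        byte[] enc = escape(blob);  // bytes: 01 00 FF 02
        System.out.println(enc.length);                                   // 4
        System.out.println(java.util.Arrays.equals(unescape(enc), blob)); // true
    }
}
```

Any scheme along these lines trades a slightly larger encoded size for the ability to place the column anywhere in the composite key.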



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7348) Default INCLUDE scopes given in CREATE CDC are not getting recognized

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7348.
---
Resolution: Fixed

> Default INCLUDE scopes given in CREATE CDC are not getting recognized
> -
>
> Key: PHOENIX-7348
> URL: https://issues.apache.org/jira/browse/PHOENIX-7348
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Hari Krishna Dara
>Assignee: Hari Krishna Dara
>Priority: Minor
> Fix For: 5.3.0
>
>
> The CREATE CDC statement allows specifying a default for the change image 
> scopes which should get used when there is no query hint, but this value is 
> not getting used. There is also no test to catch this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7348) Default INCLUDE scopes given in CREATE CDC are not getting recognized

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7348:
--
Fix Version/s: 5.3.0

> Default INCLUDE scopes given in CREATE CDC are not getting recognized
> -
>
> Key: PHOENIX-7348
> URL: https://issues.apache.org/jira/browse/PHOENIX-7348
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Hari Krishna Dara
>Assignee: Hari Krishna Dara
>Priority: Minor
> Fix For: 5.3.0
>
>
> The CREATE CDC statement allows specifying a default for the change image 
> scopes which should get used when there is no query hint, but this value is 
> not getting used. There is also no test to catch this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-6714) Return update status from Conditional Upserts

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-6714.
---
Resolution: Fixed

> Return update status from Conditional Upserts
> -
>
> Key: PHOENIX-6714
> URL: https://issues.apache.org/jira/browse/PHOENIX-6714
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Tanuj Khurana
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> {code:java}
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123);
> 1 row affected (0.005 seconds)
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123) on duplicate key 
> ignore;
> 1 row affected (0.008 seconds){code}
> Even when the row already exists, we return “1” row updated.
> {code:java}
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123) on duplicate key 
> update
> val=val;
> 1 row affected (0.01 seconds) {code}
> In this case, the value of column ‘val’ does not change, so we could return 
> “0” to denote that fact. I say “could” because, as per the current 
> implementation, even though from the application's perspective the value of 
> the column is the same, from HBase's perspective we are doing another PUT 
> mutation, which adds another version to the underlying cell and updates the 
> cell timestamp. We also update the timestamp of the empty cell. So, 
> technically, this is an update from HBase's perspective.
> Referring to MySQL, which has similar conditional update constructs, its 
> documentation says: “With ON DUPLICATE KEY UPDATE, the affected-rows value 
> per row is 1 if the row is inserted as a new row, 2 if an existing row is 
> updated, and 0 if an existing row is set to its current values.”
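The MySQL convention quoted above can be written as a small decision table. This is a sketch of those semantics only, not Phoenix's current behavior or code:

```java
public class AffectedRows {
    // MySQL's ON DUPLICATE KEY UPDATE convention for the affected-rows
    // value, per the documentation quoted above:
    //   1 -> row inserted as a new row
    //   2 -> an existing row was updated
    //   0 -> an existing row was set to its current values (no change)
    static int affectedRows(boolean rowExisted, boolean valueChanged) {
        if (!rowExisted) {
            return 1;
        }
        return valueChanged ? 2 : 0;
    }

    public static void main(String[] args) {
        System.out.println(affectedRows(false, true)); // 1: fresh insert
        System.out.println(affectedRows(true, true));  // 2: existing row updated
        System.out.println(affectedRows(true, false)); // 0: set to current values
    }
}
```

A JDBC client would observe this value as the return of `Statement.executeUpdate`.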



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-6714) Return update status from Conditional Upserts

2024-07-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-6714:
--
Fix Version/s: 5.2.1
   5.3.0

> Return update status from Conditional Upserts
> -
>
> Key: PHOENIX-6714
> URL: https://issues.apache.org/jira/browse/PHOENIX-6714
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Tanuj Khurana
>Assignee: Jing Yu
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> {code:java}
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123);
> 1 row affected (0.005 seconds)
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123) on duplicate key 
> ignore;
> 1 row affected (0.008 seconds){code}
> Even when the row already exists, we return “1” row updated.
> {code:java}
> 0: jdbc:phoenix:localhost> upsert into T1 values ('cd', 123) on duplicate key 
> update
> val=val;
> 1 row affected (0.01 seconds) {code}
> In this case, the value of column ‘val’ does not change, so we could return 
> “0” to denote that fact. I say “could” because, as per the current 
> implementation, even though from the application's perspective the value of 
> the column is the same, from HBase's perspective we are doing another PUT 
> mutation, which adds another version to the underlying cell and updates the 
> cell timestamp. We also update the timestamp of the empty cell. So, 
> technically, this is an update from HBase's perspective.
> Referring to MySQL, which has similar conditional update constructs, its 
> documentation says: “With ON DUPLICATE KEY UPDATE, the affected-rows value 
> per row is 1 if the row is inserted as a new row, 2 if an existing row is 
> updated, and 0 if an existing row is set to its current values.”



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (PHOENIX-7316) Need close more Statements

2024-06-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved PHOENIX-7316.
---
Resolution: Fixed

> Need close more Statements
> --
>
> Key: PHOENIX-7316
> URL: https://issues.apache.org/jira/browse/PHOENIX-7316
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: chaijunjie
>Assignee: chaijunjie
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> After PHOENIX-6560, we find there are still many Statements that are not 
> closed:
> 1. org.apache.phoenix.schema.MetaDataClient
> incrementStatement
> linkStatement
> setAsync
> setSync
> tableUpsert
> linkStatement
> incrementStatement
> tableUpsert
> 2. org.apache.phoenix.schema.task.Task
> org.apache.phoenix.schema.task.Task#populateTasks
> org.apache.phoenix.schema.task.Task#executeStatementAndGetTaskMutations
> 3. 
> org.apache.phoenix.jdbc.PhoenixStatement.ExecutableShowTablesStatement#compilePlan
> 4. org.apache.phoenix.trace.PhoenixMetricsSink
> org.apache.phoenix.trace.PhoenixMetricsSink#createTable
> org.apache.phoenix.trace.PhoenixMetricsSink#putMetrics
> 5. org.apache.phoenix.trace.TraceWriter.FlushMetrics#addToBatch
> 6. org.apache.phoenix.trace.TraceWriter#createTable
> 7. org.apache.phoenix.trace.TraceReader#readAll
> 8. 
> org.apache.phoenix.mapreduce.index.automation.PhoenixMRJobSubmitter#getCandidateJobs
>  
> There may be more; I will raise a PR to try to fix them all.
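The fix for each of these spots is the standard try-with-resources pattern. The sketch below uses a stand-in AutoCloseable rather than a real JDBC Statement (so it runs without a cluster), but the guarantee it demonstrates is the one the fix relies on: close() runs on every exit path, including when execute() throws.

```java
public class TryWithResourcesSketch {
    // Stand-in for a JDBC Statement: records whether close() ran.
    static class FakeStatement implements AutoCloseable {
        boolean closed = false;
        void execute() { throw new RuntimeException("simulated query failure"); }
        @Override public void close() { closed = true; }
    }

    // try-with-resources guarantees close() even though execute()
    // throws; a plain `stmt = conn.createStatement(); stmt.execute();`
    // without a finally block would leak the statement here.
    static FakeStatement runAndLeakCheck() {
        FakeStatement stmt = new FakeStatement();
        try (FakeStatement s = stmt) {
            s.execute();
        } catch (RuntimeException expected) {
            // the statement is already closed by the time we get here
        }
        return stmt;
    }

    public static void main(String[] args) {
        System.out.println(runAndLeakCheck().closed); // true
    }
}
```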



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7316) Need close more Statements

2024-06-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7316:
--
Fix Version/s: 5.2.1
   5.3.0

> Need close more Statements
> --
>
> Key: PHOENIX-7316
> URL: https://issues.apache.org/jira/browse/PHOENIX-7316
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.2.0
>Reporter: chaijunjie
>Assignee: chaijunjie
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> After PHOENIX-6560, we find there are still many Statements that are not 
> closed:
> 1. org.apache.phoenix.schema.MetaDataClient
> incrementStatement
> linkStatement
> setAsync
> setSync
> tableUpsert
> linkStatement
> incrementStatement
> tableUpsert
> 2. org.apache.phoenix.schema.task.Task
> org.apache.phoenix.schema.task.Task#populateTasks
> org.apache.phoenix.schema.task.Task#executeStatementAndGetTaskMutations
> 3. 
> org.apache.phoenix.jdbc.PhoenixStatement.ExecutableShowTablesStatement#compilePlan
> 4. org.apache.phoenix.trace.PhoenixMetricsSink
> org.apache.phoenix.trace.PhoenixMetricsSink#createTable
> org.apache.phoenix.trace.PhoenixMetricsSink#putMetrics
> 5. org.apache.phoenix.trace.TraceWriter.FlushMetrics#addToBatch
> 6. org.apache.phoenix.trace.TraceWriter#createTable
> 7. org.apache.phoenix.trace.TraceReader#readAll
> 8. 
> org.apache.phoenix.mapreduce.index.automation.PhoenixMRJobSubmitter#getCandidateJobs
>  
> There may be more; I will raise a PR to try to fix them all.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PHOENIX-7339) HBase flushes with custom clock needs to disable remote procedure delay

2024-06-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7339:
--
Issue Type: Test  (was: Bug)

> HBase flushes with custom clock needs to disable remote procedure delay
> ---
>
> Key: PHOENIX-7339
> URL: https://issues.apache.org/jira/browse/PHOENIX-7339
> Project: Phoenix
>  Issue Type: Test
>Reporter: Istvan Toth
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> The Job takes ~3 hours with 2.4 , ~3.5 hours with 2.5 and is interrupted 
> after 5 hours with 2.6.
> While I did not see OOM errors, this could still be GC thrashing, as newer 
> HBase / Hadoop version use more heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (PHOENIX-7339) HBase flushes with custom clock needs to disable remote procedure delay

2024-06-24 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7339:
-

Assignee: Viraj Jasani

> HBase flushes with custom clock needs to disable remote procedure delay
> ---
>
> Key: PHOENIX-7339
> URL: https://issues.apache.org/jira/browse/PHOENIX-7339
> Project: Phoenix
>  Issue Type: Test
>Reporter: Istvan Toth
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 5.2.1, 5.3.0
>
>
> The Job takes ~3 hours with 2.4 , ~3.5 hours with 2.5 and is interrupted 
> after 5 hours with 2.6.
> While I did not see OOM errors, this could still be GC thrashing, as newer 
> HBase / Hadoop version use more heap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

