[hive] branch master updated (2d1bf27 -> 2806d5f)

2021-07-19 Thread aasha
This is an automated email from the ASF dual-hosted git repository.

aasha pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from 2d1bf27  HIVE-25190: Fix many small allocations in BytesColumnVector
 add 2806d5f  HIVE-22626. Fix Replication related tests. (#2452)(Ayush Saxena, reviewed by Arko Sharma)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/hadoop/hive/metastore/TestReplChangeManager.java | 4 ++--
 .../apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java | 6 +++---
 .../hadoop/hive/ql/parse/BaseReplicationScenariosAcidTables.java| 2 +-
 .../hadoop/hive/ql/parse/TestMetaStoreEventListenerInRepl.java  | 2 +-
 .../apache/hadoop/hive/ql/parse/TestReplicationOfHiveStreaming.java | 2 +-
 .../hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java   | 2 +-
 .../hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java| 2 +-
 .../ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java | 2 +-
 .../hadoop/hive/ql/parse/TestScheduledReplicationScenarios.java | 2 +-
 .../apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java  | 3 ++-
 .../hadoop/hive/ql/parse/TestStatsReplicationScenariosACID.java | 1 -
 .../ql/parse/TestStatsReplicationScenariosACIDNoAutogather.java | 1 -
 .../hive/ql/parse/TestStatsReplicationScenariosMMNoAutogather.java  | 1 -
 .../hadoop/hive/ql/txn/compactor/TestCleanerWithReplication.java| 2 +-
 14 files changed, 15 insertions(+), 17 deletions(-)


[hive] branch storage-branch-2.7 updated: HIVE-25190: Fix many small allocations in BytesColumnVector

2021-07-19 Thread omalley

omalley pushed a commit to branch storage-branch-2.7
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/storage-branch-2.7 by this push:
 new 89eaded  HIVE-25190: Fix many small allocations in BytesColumnVector
89eaded is described below

commit 89eadedb89c098b1c66e8d21e74a13ffdc6dc74d
Author: Owen O'Malley 
AuthorDate: Fri Jun 18 16:30:13 2021 -0700

HIVE-25190: Fix many small allocations in BytesColumnVector

Fixes #2408

Signed-off-by: Owen O'Malley 
---
 storage-api/pom.xml                                |   2 +-
 .../hive/ql/exec/vector/BytesColumnVector.java     | 161 ++---
 .../hive/ql/exec/vector/TestBytesColumnVector.java | 125 ++--
 3 files changed, 187 insertions(+), 101 deletions(-)

diff --git a/storage-api/pom.xml b/storage-api/pom.xml
index de68292..9ec0890 100644
--- a/storage-api/pom.xml
+++ b/storage-api/pom.xml
@@ -178,7 +178,7 @@
 2.19.1
 
   false
-  -Xmx2048m
+  -Xmx3g
   false
   
 ${project.build.directory}/tmp
diff --git a/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
index e386109..4782661 100644
--- a/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
+++ b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
@@ -46,14 +46,15 @@ public class BytesColumnVector extends ColumnVector {
    */
   public int[] length;
 
-  // A call to increaseBufferSpace() or ensureValPreallocated() will ensure that buffer[] points to
-  // a byte[] with sufficient space for the specified size.
-  private byte[] buffer;   // optional buffer to use when actually copying in data
-  private int nextFree;    // next free position in buffer
+  // Calls to ensureValPreallocated() ensure that currentValue and currentOffset
+  // are set to enough space for the value.
+  private byte[] currentValue;   // bytes for the next value
+  private int currentOffset;     // starting position in the current buffer
 
-  // Hang onto a byte array for holding smaller byte values
-  private byte[] smallBuffer;
-  private int smallBufferNextFree;
+  // A shared static buffer allocation that we use for the small values
+  private byte[] sharedBuffer;
+  // The next unused offset in the sharedBuffer.
+  private int sharedBufferOffset;
 
   private int bufferAllocationCount;
 
@@ -63,8 +64,11 @@ public class BytesColumnVector extends ColumnVector {
   // Proportion of extra space to provide when allocating more buffer space.
   static final float EXTRA_SPACE_FACTOR = (float) 1.2;
 
-  // Largest size allowed in smallBuffer
-  static final int MAX_SIZE_FOR_SMALL_BUFFER = 1024 * 1024;
+  // Largest item size allowed in sharedBuffer
+  static final int MAX_SIZE_FOR_SMALL_ITEM = 1024 * 1024;
+
+  // Largest size allowed for sharedBuffer
+  static final int MAX_SIZE_FOR_SHARED_BUFFER = 1024 * 1024 * 1024;
 
   /**
    * Use this constructor for normal operation.
@@ -118,29 +122,29 @@ public class BytesColumnVector extends ColumnVector {
    * Provide the estimated number of bytes needed to hold
    * a full column vector worth of byte string data.
    *
-   * @param estimatedValueSize  Estimated size of buffer space needed
+   * @param estimatedValueSize  Estimated size of buffer space needed per row
    */
   public void initBuffer(int estimatedValueSize) {
-    nextFree = 0;
-    smallBufferNextFree = 0;
+    sharedBufferOffset = 0;
 
     // if buffer is already allocated, keep using it, don't re-allocate
-    if (buffer != null) {
+    if (sharedBuffer != null) {
       // Free up any previously allocated buffers that are referenced by vector
       if (bufferAllocationCount > 0) {
         for (int idx = 0; idx < vector.length; ++idx) {
           vector[idx] = null;
         }
-        buffer = smallBuffer; // In case last row was a large bytes value
       }
     } else {
       // allocate a little extra space to limit need to re-allocate
-      int bufferSize = this.vector.length * (int)(estimatedValueSize * EXTRA_SPACE_FACTOR);
+      long bufferSize = (long) (this.vector.length * estimatedValueSize * EXTRA_SPACE_FACTOR);
       if (bufferSize < DEFAULT_BUFFER_SIZE) {
         bufferSize = DEFAULT_BUFFER_SIZE;
       }
-      buffer = new byte[bufferSize];
-      smallBuffer = buffer;
+      if (bufferSize > MAX_SIZE_FOR_SHARED_BUFFER) {
+        bufferSize = MAX_SIZE_FOR_SHARED_BUFFER;
+      }
+      sharedBuffer = new byte[(int) bufferSize];
     }
     bufferAllocationCount = 0;
   }
@@ -156,10 +160,7 @@ public class BytesColumnVector extends ColumnVector {
    * @return amount of buffer space currently allocated
    */
   public int bufferSize() {
-    if (buffer == null) {
-  
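
Aside on the initBuffer sizing change in the hunk above: a minimal, self-contained sketch of the clamp-to-range computation, not the committed code. The names `BufferSizing` and `sharedBufferSize` are hypothetical, and the 16 KiB value used for `DEFAULT_BUFFER_SIZE` is an assumption, since the diff does not show it. In Java an `int * int` product is evaluated in 32-bit arithmetic before any widening, so this sketch casts one operand to `long` first.

```java
// Hypothetical helper; only the constant names mirror the diff.
final class BufferSizing {
  // Assumed value: the diff references DEFAULT_BUFFER_SIZE but does not show it.
  static final int DEFAULT_BUFFER_SIZE = 16 * 1024;
  static final float EXTRA_SPACE_FACTOR = 1.2f;
  static final int MAX_SIZE_FOR_SHARED_BUFFER = 1024 * 1024 * 1024;

  // Returns rows * estimatedValueSize * 1.2, clamped to
  // [DEFAULT_BUFFER_SIZE, MAX_SIZE_FOR_SHARED_BUFFER].
  static int sharedBufferSize(int rows, int estimatedValueSize) {
    // Widen before multiplying so the product cannot wrap around in int.
    long size = (long) ((long) rows * estimatedValueSize * EXTRA_SPACE_FACTOR);
    size = Math.max(size, DEFAULT_BUFFER_SIZE);
    size = Math.min(size, MAX_SIZE_FOR_SHARED_BUFFER);
    return (int) size; // safe: the clamp keeps the value within int range
  }

  public static void main(String[] args) {
    int rows = 1024;
    int perRow = 4 * 1024 * 1024;
    System.out.println("int product:  " + (rows * perRow));
    System.out.println("clamped size: " + sharedBufferSize(rows, perRow));
  }
}
```

With 1024 rows and an estimate of 4 MiB per row, the plain `int` product wraps to 0, while the sketch returns the 1 GiB cap.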

[hive] branch storage-branch-2.8 updated: HIVE-25190: Fix many small allocations in BytesColumnVector

2021-07-19 Thread omalley

omalley pushed a commit to branch storage-branch-2.8
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/storage-branch-2.8 by this push:
 new b304acd  HIVE-25190: Fix many small allocations in BytesColumnVector
b304acd is described below

commit b304acd36334e54d276e9ded5851ae24d2f23595
Author: Owen O'Malley 
AuthorDate: Fri Jun 18 16:30:13 2021 -0700

HIVE-25190: Fix many small allocations in BytesColumnVector

Fixes #2408

Signed-off-by: Owen O'Malley 
---
 storage-api/pom.xml                                |   2 +-
 .../hive/ql/exec/vector/BytesColumnVector.java     | 163 ++---
 .../hive/ql/exec/vector/TestBytesColumnVector.java | 124 ++--
 3 files changed, 187 insertions(+), 102 deletions(-)

diff --git a/storage-api/pom.xml b/storage-api/pom.xml
index 53fa3c0..c87aed7 100644
--- a/storage-api/pom.xml
+++ b/storage-api/pom.xml
@@ -185,7 +185,7 @@
 3.0.0-M4
 
   false
-  -Xmx2048m
+  -Xmx3g
   false
   
 ${project.build.directory}/tmp
diff --git a/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
index 6618807..a8c58ac 100644
--- a/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
+++ b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
@@ -49,14 +49,15 @@ public class BytesColumnVector extends ColumnVector {
    */
   public int[] length;
 
-  // A call to increaseBufferSpace() or ensureValPreallocated() will ensure that buffer[] points to
-  // a byte[] with sufficient space for the specified size.
-  private byte[] buffer;   // optional buffer to use when actually copying in data
-  private int nextFree;    // next free position in buffer
+  // Calls to ensureValPreallocated() ensure that currentValue and currentOffset
+  // are set to enough space for the value.
+  private byte[] currentValue;   // bytes for the next value
+  private int currentOffset;     // starting position in the current buffer
 
-  // Hang onto a byte array for holding smaller byte values
-  private byte[] smallBuffer;
-  private int smallBufferNextFree;
+  // A shared static buffer allocation that we use for the small values
+  private byte[] sharedBuffer;
+  // The next unused offset in the sharedBuffer.
+  private int sharedBufferOffset;
 
   private int bufferAllocationCount;
 
@@ -66,8 +67,11 @@ public class BytesColumnVector extends ColumnVector {
   // Proportion of extra space to provide when allocating more buffer space.
   static final float EXTRA_SPACE_FACTOR = (float) 1.2;
 
-  // Largest size allowed in smallBuffer
-  static final int MAX_SIZE_FOR_SMALL_BUFFER = 1024 * 1024;
+  // Largest item size allowed in sharedBuffer
+  static final int MAX_SIZE_FOR_SMALL_ITEM = 1024 * 1024;
+
+  // Largest size allowed for sharedBuffer
+  static final int MAX_SIZE_FOR_SHARED_BUFFER = 1024 * 1024 * 1024;
 
   /**
    * Use this constructor for normal operation.
@@ -121,30 +125,30 @@ public class BytesColumnVector extends ColumnVector {
    * Provide the estimated number of bytes needed to hold
    * a full column vector worth of byte string data.
    *
-   * @param estimatedValueSize  Estimated size of buffer space needed
+   * @param estimatedValueSize  Estimated size of buffer space needed per row
    */
   public void initBuffer(int estimatedValueSize) {
-    nextFree = 0;
-    smallBufferNextFree = 0;
+    sharedBufferOffset = 0;
 
     // if buffer is already allocated, keep using it, don't re-allocate
-    if (buffer != null) {
+    if (sharedBuffer != null) {
       // Free up any previously allocated buffers that are referenced by vector
       if (bufferAllocationCount > 0) {
         for (int idx = 0; idx < vector.length; ++idx) {
           vector[idx] = null;
           length[idx] = 0;
         }
-        buffer = smallBuffer; // In case last row was a large bytes value
       }
     } else {
       // allocate a little extra space to limit need to re-allocate
-      int bufferSize = this.vector.length * (int)(estimatedValueSize * EXTRA_SPACE_FACTOR);
+      long bufferSize = (long) (this.vector.length * estimatedValueSize * EXTRA_SPACE_FACTOR);
       if (bufferSize < DEFAULT_BUFFER_SIZE) {
         bufferSize = DEFAULT_BUFFER_SIZE;
       }
-      buffer = new byte[bufferSize];
-      smallBuffer = buffer;
+      if (bufferSize > MAX_SIZE_FOR_SHARED_BUFFER) {
+        bufferSize = MAX_SIZE_FOR_SHARED_BUFFER;
+      }
+      sharedBuffer = new byte[(int) bufferSize];
     }
     bufferAllocationCount = 0;
   }
@@ -160,10 +164,7 @@ public class BytesColumnVector extends ColumnVector {
    * @return amount of buffer space currently allocated
    */
   public int bufferSize() {
-if 
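
To make the renamed fields in the hunk above concrete, here is a rough sketch of how a shared buffer with a per-item size cutoff could hand out space. It is an illustration under assumptions, not the code from this commit: the class `SharedBufferSketch`, the method `reserve`, the 16 KiB starting size, and the doubling growth policy are all hypothetical. Only the field names and the two constants come from the diff.

```java
// Hypothetical illustration; not the BytesColumnVector implementation.
final class SharedBufferSketch {
  static final int MAX_SIZE_FOR_SMALL_ITEM = 1024 * 1024;
  static final int MAX_SIZE_FOR_SHARED_BUFFER = 1024 * 1024 * 1024;

  byte[] sharedBuffer = new byte[16 * 1024]; // assumed starting size
  int sharedBufferOffset = 0;                // next unused offset in sharedBuffer
  byte[] currentValue;                       // array that will hold the next value
  int currentOffset;                         // where the next value starts in it

  // Point currentValue/currentOffset at space for `size` bytes.
  void reserve(int size) {
    if (size > MAX_SIZE_FOR_SMALL_ITEM) {
      // Large items get a private array and never touch the shared buffer.
      currentValue = new byte[size];
      currentOffset = 0;
      return;
    }
    if ((long) sharedBufferOffset + size > sharedBuffer.length) {
      // Assumed policy: start a fresh, larger buffer (capped at 1 GiB).
      // Values written earlier keep referencing the old array.
      long grown = Math.max((long) sharedBuffer.length * 2, size);
      sharedBuffer = new byte[(int) Math.min(grown, MAX_SIZE_FOR_SHARED_BUFFER)];
      sharedBufferOffset = 0;
    }
    currentValue = sharedBuffer;
    currentOffset = sharedBufferOffset;
    sharedBufferOffset += size;
  }

  public static void main(String[] args) {
    SharedBufferSketch s = new SharedBufferSketch();
    s.reserve(10);
    s.reserve(20);
    System.out.println("second small value starts at " + s.currentOffset);
    s.reserve(2 * 1024 * 1024);
    System.out.println("large value is private: " + (s.currentValue != s.sharedBuffer));
  }
}
```

Small values are packed back to back in one array, so a batch of many short strings costs one allocation instead of one per value; only items above the cutoff pay for their own array.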

[hive] branch master updated (7553a60 -> 2d1bf27)

2021-07-19 Thread omalley

omalley pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from 7553a60  HIVE-25336. Use single call to get tables in DropDatabaseAnalyzer. (#2481)(Ayush Saxena, reviewed by Miklos Gergley)
 add 2d1bf27  HIVE-25190: Fix many small allocations in BytesColumnVector

No new revisions were added by this update.

Summary of changes:
 storage-api/pom.xml                                |   2 +-
 .../hive/ql/exec/vector/BytesColumnVector.java     | 163 ++---
 .../hive/ql/exec/vector/TestBytesColumnVector.java | 124 ++--
 3 files changed, 187 insertions(+), 102 deletions(-)


[hive] branch master updated (273593e -> 7553a60)

2021-07-19 Thread aasha

aasha pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from 273593e  HIVE-25209: SELECT query with SUM function producing unexpected result (#2360) (Soumyakanti Das reviewed by Zoltan Haindrich)
 add 7553a60  HIVE-25336. Use single call to get tables in DropDatabaseAnalyzer. (#2481)(Ayush Saxena, reviewed by Miklos Gergley)

No new revisions were added by this update.

Summary of changes:
 .../hive/ql/ddl/database/drop/DropDatabaseAnalyzer.java  | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)


[hive] branch master updated (ff27abe -> 273593e)

2021-07-19 Thread kgyrtkirk

kgyrtkirk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from ff27abe  HIVE-25311: Slow compilation of union operators with >100 branches (#2456) (Zoltan Haindrich reviewed by Krisztian Kasa)
 add 273593e  HIVE-25209: SELECT query with SUM function producing unexpected result (#2360) (Soumyakanti Das reviewed by Zoltan Haindrich)

No new revisions were added by this update.

Summary of changes:
 .../hadoop/hive/ql/optimizer/StatsOptimizer.java   |  6 ++
 ql/src/test/queries/clientpositive/select_sum.q| 17 
 .../results/clientpositive/llap/select_sum.q.out   | 90 ++
 3 files changed, 113 insertions(+)
 create mode 100644 ql/src/test/queries/clientpositive/select_sum.q
 create mode 100644 ql/src/test/results/clientpositive/llap/select_sum.q.out


[hive] branch master updated (66c72f7 -> ff27abe)

2021-07-19 Thread kgyrtkirk

kgyrtkirk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from 66c72f7  HIVE-25256: Support ALTER TABLE CHANGE COLUMN for Iceberg (Marton Bod, reviewed by Peter Vary and Adam Szita)
 add ff27abe  HIVE-25311: Slow compilation of union operators with >100 branches (#2456) (Zoltan Haindrich reviewed by Krisztian Kasa)

No new revisions were added by this update.

Summary of changes:
 .../apache/hadoop/hive/ql/parse/GenTezUtils.java   | 74 +-
 1 file changed, 72 insertions(+), 2 deletions(-)


[hive] branch master updated (c10de61 -> 66c72f7)

2021-07-19 Thread szita

szita pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.


from c10de61  HIVE-25253: Incremental rebuild of partitioned insert only materialized views (Krisztian Kasa, reviewed by Jesus Camacho Rodriguez, Zoltan Haindrich)
 add 66c72f7  HIVE-25256: Support ALTER TABLE CHANGE COLUMN for Iceberg (Marton Bod, reviewed by Peter Vary and Adam Szita)

No new revisions were added by this update.

Summary of changes:
 .../src/test/results/negative/hbase_ddl.q.out  |  2 +-
 .../org/apache/iceberg/hive/HiveSchemaUtil.java| 28 +++
 .../iceberg/mr/hive/HiveIcebergMetaHook.java   | 89 +++---
 .../iceberg/mr/hive/HiveIcebergStorageHandler.java |  8 +-
 .../hive/TestHiveIcebergStorageHandlerNoScan.java  | 60 +++
 .../TestHiveIcebergStorageHandlerWithEngine.java   | 31 
 .../ddl/table/AbstractBaseAlterTableAnalyzer.java  |  4 +-
 .../hadoop/hive/ql/ddl/table/AlterTableType.java   |  2 +-
 .../hive/ql/metadata/HiveStorageHandler.java   | 16 
 .../results/clientnegative/alter_non_native.q.out  |  2 +-
 10 files changed, 227 insertions(+), 15 deletions(-)