armitage420 commented on code in PR #5539:
URL: https://github.com/apache/hive/pull/5539#discussion_r1915361539
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -4466,6 +4526,71 @@ public List<Partition> getPartitionsByFilter(Table tbl, String filter)
     return convertFromMetastore(tbl, tParts);
   }
+  public List<Partition> getPartitionsWithSpecs(Table tbl, GetPartitionsRequest request)
+      throws HiveException, TException {
+
+    if (!tbl.isPartitioned()) {
+      throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, tbl.getTableName());
+    }
+    int batchSize= MetastoreConf.getIntVar(Hive.get().getConf(), MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX);
+    if(batchSize > 0){
+      return new ArrayList<>(getAllPartitionsWithSpecsInBatches(tbl, batchSize, DEFAULT_BATCH_DECAYING_FACTOR, MetastoreConf.getIntVar(
+          Hive.get().getConf(), MetastoreConf.ConfVars.GETPARTITIONS_BATCH_MAX_RETRIES), request));
+    }else{
+      return getPartitionsWithSpecsInternal(tbl, request);
+    }
+  }
+
+  public List<Partition> getPartitionsWithSpecsInternal(Table tbl, GetPartitionsRequest request)
+      throws HiveException, TException {
+
+    if (!tbl.isPartitioned()) {
+      throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, tbl.getTableName());
+    }
+    GetPartitionsResponse response = getMSC().getPartitionsWithSpecs(request);
+    List<org.apache.hadoop.hive.metastore.api.PartitionSpec> partitionSpecs = response.getPartitionSpec();
+    List<Partition> partitions = new ArrayList<>();
+    partitions.addAll(convertFromPartSpec(partitionSpecs.iterator(), tbl));
+
+    return partitions;
+  }
+
+  List<Partition> getPartitionsWithSpecsByNames(Table tbl, List<String> partNames, GetPartitionsRequest request)
Review Comment:
This case is meant for very large partitioned tables, where a single Thrift
response could hit the 2 GB size limit.
We might need to change the value of METASTORE_BATCH_RETRIEVE_MAX to benefit
from this method. We will have to pick an approximate value for it, though,
because I have not been able to come up with a way to calculate the required
maximum batch size at run time. The response size is also going to vary a lot
now that partitions are fetched with dynamic projections.
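
To illustrate why an approximate starting value may be good enough, here is a minimal sketch of the decaying-batch idea, assuming a failed batch is retried with a smaller size. The class `DecayingBatchSketch`, the method `fetchInBatches`, and the `fetcher` callback are hypothetical names made up for this illustration; this is not the code in this PR and not Hive's actual retry utility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of batched fetching with a decaying batch size. */
final class DecayingBatchSketch {

  /**
   * Fetches results for all names in batches. When a batch fails (for example
   * because the response was too large), the batch size is divided by
   * decayFactor and the same position is retried. Gives up once more than
   * maxRetries batches have failed, or when a batch of size 1 fails.
   */
  static <T> List<T> fetchInBatches(List<String> names, int batchSize, int decayFactor,
      int maxRetries, Function<List<String>, List<T>> fetcher) {
    if (batchSize <= 0 || decayFactor < 2) {
      throw new IllegalArgumentException("batchSize must be > 0 and decayFactor >= 2");
    }
    List<T> result = new ArrayList<>();
    int size = batchSize;
    int failures = 0;
    int start = 0;
    while (start < names.size()) {
      int end = Math.min(start + size, names.size());
      try {
        // Assumption: fetcher stands in for one metastore round trip.
        result.addAll(fetcher.apply(names.subList(start, end)));
        start = end; // advance only after the batch succeeded
      } catch (RuntimeException e) {
        failures++;
        if (failures > maxRetries || size == 1) {
          throw new IllegalStateException(
              "Giving up after " + failures + " failed attempts", e);
        }
        size = Math.max(1, size / decayFactor); // shrink and retry the same range
      }
    }
    return result;
  }
}
```

With a scheme like this, a starting value that is too large costs a few extra round trips but still converges, while a value that is too small only adds more calls. That is the reason a rough estimate for the initial batch size may be acceptable here.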
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]