[jira] [Created] (HBASE-25859) Reference class incorrectly parses the protobuf magic marker

2021-05-06 Thread Constantin-Catalin Luca (Jira)
Constantin-Catalin Luca created HBASE-25859:
---

 Summary: Reference class incorrectly parses the protobuf magic marker
 Key: HBASE-25859
 URL: https://issues.apache.org/jira/browse/HBASE-25859
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.4.1
Reporter: Constantin-Catalin Luca
Assignee: Constantin-Catalin Luca


The Reference class incorrectly parses the protobuf magic marker.

It uses `DataInputStream.read(byte[lengthOfPBMagic])`, but this call is not 
guaranteed to read all the bytes of the marker.
The fix is the same as the one for 
https://issues.apache.org/jira/browse/HBASE-25674
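
The short-read problem described above can be reproduced outside HBase. The 
following is only a minimal sketch, not the actual Reference code or the 
actual patch: the class name, the TrickleStream helper and the "PBUF" marker 
bytes are assumptions made for illustration. It contrasts a single 
`DataInputStream.read(byte[])` call, whose return value is ignored, with 
`DataInputStream.readFully(byte[])`, which keeps reading until the buffer is 
full or the stream ends.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MagicMarkerDemo {
  // Assumed marker bytes, used here only as a stand-in for the protobuf magic.
  static final byte[] MAGIC = "PBUF".getBytes(StandardCharsets.US_ASCII);

  // Hypothetical stream that hands out at most one byte per bulk read,
  // imitating a network stream that delivers data in small pieces.
  static final class TrickleStream extends ByteArrayInputStream {
    TrickleStream(byte[] data) {
      super(data);
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) {
      return super.read(b, off, Math.min(len, 1));
    }
  }

  // Problematic pattern: one read() call, return value ignored.
  // The buffer may be only partially filled when it is compared.
  static boolean hasMagicSingleRead(InputStream raw) {
    byte[] buf = new byte[MAGIC.length];
    try {
      new DataInputStream(raw).read(buf);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return Arrays.equals(buf, MAGIC);
  }

  // Safer pattern: readFully() loops until the buffer is full and throws
  // EOFException if the stream ends first.
  static boolean hasMagicReadFully(InputStream raw) {
    byte[] buf = new byte[MAGIC.length];
    try {
      new DataInputStream(raw).readFully(buf);
    } catch (EOFException e) {
      return false; // stream ended before a whole marker arrived
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return Arrays.equals(buf, MAGIC);
  }

  public static void main(String[] args) {
    byte[] data = "PBUFpayload".getBytes(StandardCharsets.US_ASCII);
    System.out.println(hasMagicSingleRead(new TrickleStream(data))); // false
    System.out.println(hasMagicReadFully(new TrickleStream(data)));  // true
  }
}
```

With an in-memory stream that returns everything at once, both variants 
agree, which is why this kind of bug tends to show up only against slower or 
chunked streams.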



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-25674) RegionInfo.parseFrom(DataInputStream) does not read correc

2021-03-17 Thread Constantin-Catalin Luca (Jira)
Constantin-Catalin Luca created HBASE-25674:
---

 Summary: RegionInfo.parseFrom(DataInputStream) does not read correc
 Key: HBASE-25674
 URL: https://issues.apache.org/jira/browse/HBASE-25674
 Project: HBase
  Issue Type: Bug
Reporter: Constantin-Catalin Luca








[jira] [Created] (HBASE-24541) Add support to run LoadIncrementalHFiles in a distributed manner

2020-06-11 Thread Constantin-Catalin Luca (Jira)
Constantin-Catalin Luca created HBASE-24541:
---

 Summary: Add support to run LoadIncrementalHFiles in a distributed manner
 Key: HBASE-24541
 URL: https://issues.apache.org/jira/browse/HBASE-24541
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce, Performance
Affects Versions: 1.4.0
Reporter: Constantin-Catalin Luca


LoadIncrementalHFiles takes a very long time to complete when running HBase on 
top of S3 and attempting to bulkload 500K-700K files.

The root cause is a combination of the higher latency of S3 (compared with 
HDFS) and the calls that LoadIncrementalHFiles makes to the underlying 
filesystem: each file is opened, the stream seeks to the trailer offset at 
the end, and then the trailer is read.
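
The per-file access pattern described above (open, seek to the trailer 
offset, read the trailer) can be sketched against a local file. This is an 
illustration only and does not use the HBase or Hadoop APIs: the class name, 
the helper methods and the fixed trailerSize parameter are assumptions. On a 
remote store, each of these steps costs network round trips, and the cost is 
paid once per file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TrailerReadSketch {
  // Opens the file, seeks to (length - trailerSize), and reads the trailer.
  // trailerSize is a hypothetical fixed size chosen by the caller.
  static byte[] readTrailer(String path, int trailerSize) {
    try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
      long length = file.length();
      if (trailerSize < 0 || length < trailerSize) {
        throw new IllegalArgumentException("file is shorter than the trailer");
      }
      file.seek(length - trailerSize); // jump straight to the trailer offset
      byte[] trailer = new byte[trailerSize];
      file.readFully(trailer);         // read exactly trailerSize bytes
      return trailer;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Demo helper: writes the given bytes to a temporary file, returns its path.
  static String writeTempFile(byte[] contents) {
    try {
      Path p = Files.createTempFile("trailer-sketch", ".bin");
      Files.write(p, contents);
      p.toFile().deleteOnExit();
      return p.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String path = writeTempFile(new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
    byte[] trailer = readTrailer(path, 3);
    System.out.println(trailer.length); // 3
  }
}
```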

Increasing the parallelism does not yield any significant improvement. This 
seems to stem from the fact that once the trailer is read the stream is not 
consumed to the end. This causes the underlying HTTP connection to be aborted 
and it cannot be re-used.

 

The proposed solution is to add support for running LoadIncrementalHFiles on 
multiple machines as a MapReduce job.


