[jira] [Updated] (HBASE-5640) bulk load runs slowly than before

2012-04-05 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5640:


Attachment: bulkLoadFs2.txt

It is better to compare the URIs than to use object equality. Object 
equality does not work because one object is of type FileSystem while the other 
object is an HFileSystem.
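
The comparison described above can be sketched as follows. This is only an
illustration under stated assumptions: PlainFs and WrappedFs are hypothetical
stand-ins, not the Hadoop FileSystem or HBase HFileSystem classes, and the
host name is made up. Only java.net.URI is a real API here.

```java
import java.net.URI;

// Hypothetical stand-in for a filesystem handle (NOT Hadoop's FileSystem).
class PlainFs {
    private final URI uri;
    PlainFs(String uri) { this.uri = URI.create(uri); }
    URI getUri() { return uri; }
}

// Hypothetical wrapper pointing at the same filesystem (NOT HFileSystem).
class WrappedFs extends PlainFs {
    WrappedFs(PlainFs inner) { super(inner.getUri().toString()); }
}

final class FsCompare {
    // Object equality: two distinct objects are unequal unless equals() is overridden.
    static boolean sameByObject(PlainFs a, PlainFs b) {
        return a.equals(b);
    }

    // URI equality: true whenever both handles name the same filesystem.
    static boolean sameByUri(PlainFs a, PlainFs b) {
        return a.getUri().equals(b.getUri());
    }

    public static void main(String[] args) {
        PlainFs src = new PlainFs("hdfs://namenode:8020");
        PlainFs dst = new WrappedFs(src);
        System.out.println("object equality: " + sameByObject(src, dst));
        System.out.println("URI equality:    " + sameByUri(src, dst));
    }
}
```

With object equality the wrapper looks like a different filesystem, which
would trigger the "moving to this filesystem" path; comparing URIs does not.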

 bulk load runs slowly than before
 -

 Key: HBASE-5640
 URL: https://issues.apache.org/jira/browse/HBASE-5640
 Project: HBase
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
  Labels: bulkloader
 Attachments: bulkLoadFs1.txt, bulkLoadFs2.txt


 I am loading data from an external system into hbase. There are many prints 
 of the form:
   on different filesystem than destination store - moving to this filesystem
 This is possibly a regression caused by a recent patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5640) bulk load runs slowly than before

2012-03-28 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5640:


Attachment: bulkLoadFs1.txt

This is the fix I have in mind but have not yet tested in great detail.

 bulk load runs slowly than before
 -

 Key: HBASE-5640
 URL: https://issues.apache.org/jira/browse/HBASE-5640
 Project: HBase
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
  Labels: bulkloader
 Attachments: bulkLoadFs1.txt


 I am loading data from an external system into hbase. There are many prints 
 of the form:
   on different filesystem than destination store - moving to this filesystem
 This is possibly a regression caused by a recent patch.





[jira] [Updated] (HBASE-5074) support checksums in HBase block cache

2012-03-06 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5074:


Status: Open  (was: Patch Available)

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, 
 D1521.13.patch, D1521.13.patch, D1521.14.patch, D1521.14.patch, 
 D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, 
 D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, 
 D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, 
 D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata (checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the data file and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.
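
To make the idea concrete, here is a minimal sketch of keeping the checksum
inline with the data so that one read returns both. It is an assumption-laden
illustration, not the HBASE-5074 block format: the class name
InlineChecksumBlock and the layout (payload followed by an 8-byte CRC32 value)
are invented for this example.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class InlineChecksumBlock {
    // Append the CRC32 of the payload, so data and checksum share one buffer.
    // The value is stored as a long (8 bytes) purely for simplicity.
    static byte[] encode(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return ByteBuffer.allocate(data.length + Long.BYTES)
                .put(data)
                .putLong(crc.getValue())
                .array();
    }

    // Split payload and checksum from a single buffer and verify the payload.
    static byte[] decode(byte[] block) {
        int dataLen = block.length - Long.BYTES;
        if (dataLen < 0) {
            throw new IllegalArgumentException("block too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(block);
        byte[] data = new byte[dataLen];
        buf.get(data);
        long stored = buf.getLong();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return data;
    }
}
```

Because verification needs nothing beyond the block itself, a reader using
this layout would not have to touch a separate checksum file.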





[jira] [Updated] (HBASE-5074) support checksums in HBase block cache

2012-03-06 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5074:


Status: Patch Available  (was: Open)

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, 
 D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, 
 D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, 
 D1521.13.patch, D1521.13.patch, D1521.14.patch, D1521.14.patch, 
 D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, 
 D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, 
 D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, 
 D1521.9.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata (checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the data file and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.





[jira] [Updated] (HBASE-5473) Metrics does not push pread time

2012-02-24 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5473:


Assignee: dhruba borthakur
  Status: Patch Available  (was: Open)

 Metrics does not push pread time
 

 Key: HBASE-5473
 URL: https://issues.apache.org/jira/browse/HBASE-5473
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor

 The RegionServerMetrics is not pushing the pread times to the MetricsRecord.





[jira] [Updated] (HBASE-4658) Put attributes are not exposed via the ThriftServer

2012-02-06 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4658:


   Resolution: Fixed
Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

 Put attributes are not exposed via the ThriftServer
 ---

 Key: HBASE-4658
 URL: https://issues.apache.org/jira/browse/HBASE-4658
 Project: HBase
  Issue Type: Bug
  Components: thrift
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: D1563.1.patch, D1563.1.patch, D1563.1.patch, 
 D1563.2.patch, D1563.2.patch, D1563.2.patch, D1563.3.patch, D1563.3.patch, 
 D1563.3.patch, ThriftPutAttributes1.txt


 The Put api also takes in a bunch of arbitrary attributes that an application 
 can use to associate metadata with each put operation. This is not exposed 
 via Thrift.





[jira] [Updated] (HBASE-4658) Put attributes are not exposed via the ThriftServer

2012-02-02 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4658:


Status: Patch Available  (was: Open)

 Put attributes are not exposed via the ThriftServer
 ---

 Key: HBASE-4658
 URL: https://issues.apache.org/jira/browse/HBASE-4658
 Project: HBase
  Issue Type: Bug
  Components: thrift
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: D1563.1.patch, D1563.1.patch, D1563.1.patch, 
 D1563.2.patch, D1563.2.patch, D1563.2.patch, D1563.3.patch, D1563.3.patch, 
 D1563.3.patch, ThriftPutAttributes1.txt


 The Put api also takes in a bunch of arbitrary attributes that an application 
 can use to associate metadata with each put operation. This is not exposed 
 via Thrift.





[jira] [Updated] (HBASE-5074) support checksums in HBase block cache

2012-01-29 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5074:


Status: Open  (was: Patch Available)

This patch is not yet ready for submission. It needs enhancement with a unit 
test and metrics collection.

 support checksums in HBase block cache
 --

 Key: HBASE-5074
 URL: https://issues.apache.org/jira/browse/HBASE-5074
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: D1521.1.patch, D1521.1.patch


 The current implementation of HDFS stores the data in one block file and the 
 metadata (checksum) in another block file. This means that every read into the 
 HBase block cache actually consumes two disk iops, one to the data file and 
 one to the checksum file. This is a major problem for scaling HBase, because 
 HBase is usually bottlenecked on the number of random disk iops that the 
 storage-hardware offers.





[jira] [Updated] (HBASE-5295) Improve the Thrift API to switch on/off writing to wal for Mutations

2012-01-28 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5295:


Status: Patch Available  (was: Open)

 Improve the Thrift API  to switch on/off writing to wal for Mutations
 -

 Key: HBASE-5295
 URL: https://issues.apache.org/jira/browse/HBASE-5295
 Project: HBase
  Issue Type: Improvement
  Components: thrift
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: D1515.1.patch, D1515.1.patch, D1515.1.patch, 
 D1515.1.patch


 The Thrift API currently does not support switching off WAL updates for 
 Puts/Deletes. Support it.





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Open  (was: Patch Available)

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)
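
As a rough illustration of what scanning at a specified read point means, the
sketch below filters cells by a write sequence number. Everything in it is
hypothetical (ReadPointScan, VersionedCell, writeNumber); it does not use or
reproduce HRegion, Scan, or HBase's actual MVCC implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class ReadPointScan {
    // Hypothetical versioned cell; writeNumber stands in for the sequence
    // number assigned to each write.
    static final class VersionedCell {
        final String row;
        final long writeNumber;

        VersionedCell(String row, long writeNumber) {
            this.row = row;
            this.writeNumber = writeNumber;
        }
    }

    // Return the rows whose writes are visible at the given read point:
    // a write is visible when its number is at or below the read point.
    static List<String> scan(List<VersionedCell> cells, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (VersionedCell c : cells) {
            if (c.writeNumber <= readPoint) {
                visible.add(c.row);
            }
        }
        return visible;
    }
}
```

A caller that passes an older read point sees a consistent earlier snapshot,
which is the kind of control the proposed overload would expose.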





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Patch Available  (was: Open)

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Attachment: scannerMVCC1.txt

Attaching the same patch file again

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-5014) PutSortReducer and KeyValueSortReduce should adhere to memory limits

2011-12-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5014:


Attachment: putSortReducer1.txt

Attached patch from Review.

 PutSortReducer and KeyValueSortReduce should adhere to memory limits
 

 Key: HBASE-5014
 URL: https://issues.apache.org/jira/browse/HBASE-5014
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: putSortReducer1.txt


 The PutSortReducer class has a configurable threshold to flush partially sorted 
 data for large rows. However, it was not using the size of the key in the 
 calculation of overall memory used.
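
The effect of leaving key bytes out of the accounting can be sketched like
this. It is an illustration only: FlushAccounting and Cell are hypothetical
names, not the PutSortReducer code or the HBase KeyValue class, and the
countKeys flag exists just to compare the two ways of counting.

```java
import java.util.ArrayList;
import java.util.List;

final class FlushAccounting {
    // Hypothetical key/value pair (NOT the HBase KeyValue class).
    static final class Cell {
        final byte[] key;
        final byte[] value;

        Cell(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    // Count how many cells are buffered before the threshold is reached.
    // With countKeys == false only value bytes are tracked, so the buffer
    // holds far more real memory than the threshold suggests.
    static int cellsBeforeFlush(List<Cell> cells, long thresholdBytes, boolean countKeys) {
        long used = 0;
        int buffered = 0;
        for (Cell c : cells) {
            if (used >= thresholdBytes) {
                break;
            }
            used += c.value.length;
            if (countKeys) {
                used += c.key.length;
            }
            buffered++;
        }
        return buffered;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            cells.add(new Cell(new byte[10], new byte[2]));
        }
        System.out.println("values only: " + cellsBeforeFlush(cells, 24, false));
        System.out.println("keys + values: " + cellsBeforeFlush(cells, 24, true));
    }
}
```

When keys are large relative to values, counting only the values lets the
buffer grow well past the configured limit before a flush.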





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Attachment: scannerMVCC1.txt

Attached patch from review.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Open  (was: Patch Available)

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Patch Available  (was: Open)

Submitting patch again, hoping that it will be picked up by committers and 
automatic build testing.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-5014) PutSortReducer and KeyValueSortReduce should adhere to memory limits

2011-12-14 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-5014:


Status: Patch Available  (was: Open)

Hi Kannan, I addressed your comment and ran all unit tests.

 PutSortReducer and KeyValueSortReduce should adhere to memory limits
 

 Key: HBASE-5014
 URL: https://issues.apache.org/jira/browse/HBASE-5014
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The PutSortReducer class has a configurable threshold to flush partially sorted 
 data for large rows. However, it was not using the size of the key in the 
 calculation of overall memory used.





[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-14 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4938:


Status: Patch Available  (was: Open)

I have run all the unit tests for this one.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor

 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)





[jira] [Updated] (HBASE-4989) Metrics to measure sequential reads and random reads separately

2011-12-11 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4989:


Attachment: metrics1.txt

Patch for trunk.

 Metrics to measure sequential reads and random reads separately
 ---

 Key: HBASE-4989
 URL: https://issues.apache.org/jira/browse/HBASE-4989
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: metrics1.txt


 HBase does sequential reads for compactions and positional random reads for 
 satisfying user queries. It would be nice if we could measure their latencies 
 separately. It is mostly the random reads that dominate a transactional 
 workload.





[jira] [Updated] (HBASE-4989) Metrics to measure sequential reads and random reads separately

2011-12-11 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4989:


Release Note: The metric fsReadLatency records the number of sequential 
reads. The metric fsPreadLatency records the number of random reads.
Hadoop Flags: Incompatible change
  Status: Patch Available  (was: Open)

 Metrics to measure sequential reads and random reads separately
 ---

 Key: HBASE-4989
 URL: https://issues.apache.org/jira/browse/HBASE-4989
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: metrics1.txt


 HBase does sequential reads for compactions and positional random reads for 
 satisfying user queries. It would be nice if we could measure their latencies 
 separately. It is mostly the random reads that dominate a transactional 
 workload.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-22 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut7.txt

Addressed feedback comments from RamKrishna and Jonathan. Fixed unit test to 
not assert erroneously. Enhanced Memstore.rollback() to roll back keys from 
memstore and snapshot.


 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt, appendNoSyncPut6.txt, appendNoSyncPut7.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-22 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut8.txt

All unit tests (except DistributedLogSplitting) pass with this patch.

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt, appendNoSyncPut6.txt, appendNoSyncPut7.txt, 
 appendNoSyncPut8.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-22 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Status: Patch Available  (was: Open)

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt, appendNoSyncPut6.txt, appendNoSyncPut7.txt, 
 appendNoSyncPut8.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-18 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut6.txt

After some discussion with Ted and JGray, we decided that the rowlock is not 
necessary for HRegion.rollbackMemstore. This is the version of the patch that 
should satisfy all parties. Please review.

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt, appendNoSyncPut6.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers

2011-10-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4588:


Attachment: configVerify2.txt

Addressed Ted's review comments.

 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers
 --

 Key: HBASE-4588
 URL: https://issues.apache.org/jira/browse/HBASE-4588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.92.0

 Attachments: configVerify1.txt, configVerify2.txt


 The floating point arithmetic to validate memory allocation configurations 
 needs to be done as integers.
 On our cluster, we had block cache = 0.6 and memstore = 0.2. It was saying 
 this was > 0.8 when it is actually equal.
 Minor bug but annoying nonetheless.
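
The rounding problem can be reproduced with a small sketch. It assumes the
validation requires a minimum free fraction of 0.2 and computes it as
1.0 minus the configured sum; MemoryLimitCheck, MIN_FREE and both method names
are hypothetical and are not taken from the attached patch.

```java
final class MemoryLimitCheck {
    // Assumed minimum fraction of heap that must stay free.
    static final float MIN_FREE = 0.2f;

    // Float version: 1.0f - (0.6f + 0.2f) rounds to slightly less than 0.2f,
    // so a configuration that is exactly at the limit is rejected.
    static boolean tooLittleFreeFloat(float blockCache, float memstore) {
        return 1.0f - (blockCache + memstore) < MIN_FREE;
    }

    // Integer version: convert each fraction to a whole percentage first,
    // then compare integers, which have no rounding error.
    static boolean tooLittleFreeInt(float blockCache, float memstore) {
        int used = Math.round(blockCache * 100f) + Math.round(memstore * 100f);
        return 100 - used < Math.round(MIN_FREE * 100f);
    }

    public static void main(String[] args) {
        System.out.println("float check rejects 0.6/0.2: " + tooLittleFreeFloat(0.6f, 0.2f));
        System.out.println("int check rejects 0.6/0.2:   " + tooLittleFreeInt(0.6f, 0.2f));
    }
}
```

The sketch uses Math.round rather than a plain cast so that a fraction stored
slightly below its decimal value still maps to the intended percentage.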





[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers

2011-10-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4588:


Attachment: configVerify2.txt

Attaching the correct patch file, with fixes for the review comments.

 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers
 --

 Key: HBASE-4588
 URL: https://issues.apache.org/jira/browse/HBASE-4588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.92.0

 Attachments: configVerify1.txt, configVerify2.txt, configVerify2.txt


 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers.
 On our cluster, we had block cache = 0.6 and memstore = 0.2.  It was saying 
 this was > 0.8 when it is actually equal.
 Minor bug but annoying nonetheless.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-16 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut5.txt

Fixed typos.

Performance numbers measured on hbase-92 with a variant of hdfs 0.20:


vanilla hdfs:    1200 puts/sec (no patch),
                 5000 puts/sec (with patch)
synconsync hdfs: 80 puts/sec (no patch)

The synconsync version of hdfs is an internal version of hdfs that makes the 
datanode issue a sync() on the corresponding ext3 block file for every 
invocation of DFSClient.sync(). This ensures that an hbase transaction is 
really, really on disk before the put rpc returns to the client.


 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt, 
 appendNoSyncPut5.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers

2011-10-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4588:


Attachment: configVerify1.txt

Convert floating point numbers to integers so that we use integer comparison 
instead of floating point comparison. This fix has been deployed to some of 
our 0.92 clusters.
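
As an illustration of the idea (not the code in configVerify1.txt), here is a
minimal Java sketch. It assumes the check is written as "the fraction left free
must not fall below a 0.2 minimum"; the class name, method names and that
threshold are hypothetical. With block cache = 0.6 and memstore = 0.2 the float
form rejects the configuration, while the integer-percentage form accepts it.

```java
final class MemoryConfigCheck {
    static final int PERCENT = 100;

    /** Float form of the check: true means the configuration is rejected. */
    static boolean rejectsWithFloats(float blockCache, float memstore, float minFree) {
        // 1.0f - (0.6f + 0.2f) comes out slightly below 0.2f in float arithmetic.
        return 1.0f - (blockCache + memstore) < minFree;
    }

    /** Integer form: convert each fraction to a whole percentage first. */
    static boolean rejectsWithInts(float blockCache, float memstore, float minFree) {
        int blockCachePct = Math.round(blockCache * PERCENT);
        int memstorePct = Math.round(memstore * PERCENT);
        int minFreePct = Math.round(minFree * PERCENT);
        return PERCENT - (blockCachePct + memstorePct) < minFreePct;
    }

    public static void main(String[] args) {
        System.out.println(rejectsWithFloats(0.6f, 0.2f, 0.2f)); // true (spurious)
        System.out.println(rejectsWithInts(0.6f, 0.2f, 0.2f));   // false
    }
}
```

The sketch rounds with Math.round rather than truncating with a cast, so that a
value such as 0.29f cannot land on 28 percent; that is a choice made here and
may differ from the attached patch.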

 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers
 --

 Key: HBASE-4588
 URL: https://issues.apache.org/jira/browse/HBASE-4588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.92.0

 Attachments: configVerify1.txt


 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers.
 On our cluster, we had block cache = 0.6 and memstore = 0.2.  It was saying 
 this was > 0.8 when it is actually equal.
 Minor bug but annoying nonetheless.





[jira] [Updated] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers

2011-10-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4588:


Status: Patch Available  (was: Open)

 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers
 --

 Key: HBASE-4588
 URL: https://issues.apache.org/jira/browse/HBASE-4588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jonathan Gray
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.92.0

 Attachments: configVerify1.txt


 The floating point arithmetic to validate memory allocation configurations 
 need to be done as integers.
 On our cluster, we had block cache = 0.6 and memstore = 0.2.  It was saying 
 this was > 0.8 when it is actually equal.
 Minor bug but annoying nonetheless.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-15 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSync5.txt

Addressed Kannan's, Ted's and Gary's review comments. Renamed the method to 
rollbackMemstore. The rollback method now compares memstoreTS before 
deleting the key. 
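
To show why comparing memstoreTS matters during rollback, here is a toy Java
sketch. ToyMemstore, Cell and their methods are hypothetical stand-ins, not the
HBase classes touched by the patch. The rollback removes an entry only if it
still carries the memstoreTS of the transaction being undone, so a newer write
to the same key is left alone.

```java
import java.util.concurrent.ConcurrentSkipListMap;

/** Toy in-memory store; illustration only, not HBase's MemStore. */
final class ToyMemstore {
    static final class Cell {
        final String value;
        final long memstoreTS;

        Cell(String value, long memstoreTS) {
            this.value = value;
            this.memstoreTS = memstoreTS;
        }
    }

    private final ConcurrentSkipListMap<String, Cell> cells = new ConcurrentSkipListMap<>();

    void put(String row, String value, long memstoreTS) {
        cells.put(row, new Cell(value, memstoreTS));
    }

    /** Undo a write, but only if the stored entry is the one that write inserted. */
    boolean rollback(String row, long memstoreTS) {
        Cell found = cells.get(row);
        if (found == null || found.memstoreTS != memstoreTS) {
            return false; // a different transaction owns this entry; keep it
        }
        return cells.remove(row, found); // removes only if still mapped to 'found'
    }

    String get(String row) {
        Cell c = cells.get(row);
        return c == null ? null : c.value;
    }
}
```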

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync5.txt, appendNoSyncPut1.txt, 
 appendNoSyncPut2.txt, appendNoSyncPut3.txt, appendNoSyncPut4.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-06 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut3.txt

1. The flush of memstore waits for current transactions to quiesce before 
committing the flushed files. This should address the problem pointed out by 
Kannan.

2. The Hlog.syncer() does not throw an exception; instead, it causes the 
regionserver to exit if it is unable to sync to hdfs. The assumption here is 
that if hbase is unable to write/sync to hdfs, then the simplest and correct 
error recovery is to exit. (For example, if the memstore flush fails, the 
regionserver exits.)



 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt, 
 appendNoSyncPut3.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-02 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut1.txt

This changes the multiPut operation so that the sync to the wal occurs outside 
the rowlock. 

This enhancement is done only to HRegion.mut(Put[]) because this is the only 
method that gets invoked from an application. The HRegion.put(Put) is used only 
by unit tests and should possibly be deprecated.

I have attached a unit test. I have not yet run all unit tests, but early 
feedback on this patch will be very helpful.


 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSyncPut1.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4528) The put operation can release the rowlock before sync-ing the Hlog

2011-10-02 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4528:


Attachment: appendNoSyncPut2.txt

Incorporated most of Ted's comments and all of Lars's comments.

I did not change the order of advancing the rwcc first before releasing the 
rowlock in the finally-clause because this will occur only in some error case, 
and in that case it might be better to do things in the normal order. 
Technically, either way should be fine, but if I am missing something please 
let me know and I can change it too.

In TestParallelPut, I did not fold the two loops of thread-creation and 
thread-start. The reason is that I would like more concurrency among the 
threads: if I create and start in the same loop, then it is likely that by 
the time a thread starts running, the earlier ones would have finished or 
advanced significantly, thus reducing the time when all threads are running 
concurrently.
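
The two-loop pattern described above can be sketched as follows. This is a
generic Java illustration, not the code of TestParallelPut; the class name,
method name and the shared counter are hypothetical. All threads are
constructed first and started in a separate, tight loop so that they overlap
as much as possible.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ParallelStartSketch {
    /** Runs numThreads workers and returns the total number of operations done. */
    static long runWorkers(int numThreads, int opsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[numThreads];

        // Loop 1: create every thread, but do not start any of them yet.
        for (int i = 0; i < numThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int op = 0; op < opsPerThread; op++) {
                    counter.incrementAndGet();
                }
            });
        }

        // Loop 2: start them back to back, with no construction cost in between.
        for (Thread t : workers) {
            t.start();
        }

        // Wait for all workers before reading the result.
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter.get();
    }
}
```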

 The put operation can release the rowlock before sync-ing the Hlog
 --

 Key: HBASE-4528
 URL: https://issues.apache.org/jira/browse/HBASE-4528
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSyncPut1.txt, appendNoSyncPut2.txt


 This allows for better throughput when there are hot rows. A single row 
 update improves from 100 puts/sec/server to 5000 puts/sec/server.





[jira] [Updated] (HBASE-4487) The increment operation can release the rowlock before sync-ing the Hlog

2011-09-30 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4487:


Release Note: The increment operation releases the rowlock before doing the 
sync to the HLog. This improves performance of increments on hot rows. 

 The increment operation can release the rowlock before sync-ing the Hlog
 

 Key: HBASE-4487
 URL: https://issues.apache.org/jira/browse/HBASE-4487
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync4.txt, appendNoSync5.txt, appendNoSync6.txt


 This allows for better throughput when there are hot rows. I have seen this 
 change make a single row update improve from 400 increments/sec/server to 
 4000 increments/sec/server.





[jira] [Updated] (HBASE-4487) The increment operation can release the rowlock before sync-ing the Hlog

2011-09-30 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4487:


Release Note: The increment operation releases the rowlock before doing the 
sync to the HLog. This improves performance of increments on hot rows. There is 
a fundamental change to the group-commit behaviour: it batches transactions in 
HBase code before pushing it down to the wal.  (was: The increment operation 
releases the rowlock before doing the sync to the HLog. This improves 
performance of increments on hot rows. )

 The increment operation can release the rowlock before sync-ing the Hlog
 

 Key: HBASE-4487
 URL: https://issues.apache.org/jira/browse/HBASE-4487
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: appendNoSync4.txt, appendNoSync5.txt, appendNoSync6.txt


 This allows for better throughput when there are hot rows. I have seen this 
 change make a single row update improve from 400 increments/sec/server to 
 4000 increments/sec/server.





[jira] [Updated] (HBASE-4487) The increment operation can release the rowlock before sync-ing the Hlog

2011-09-29 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4487:


Attachment: appendNoSync4.txt

The increment operation releases the rowlock before doing the sync to the HLog. 
This improves performance of increments on hot rows.

Introduced method HLog.appendNoSync() that returns a txid. The increment method 
then releases the rowlock and invokes HLog.sync(txid). The HLog.sync(txid) 
returns only if all the transactions up to the one identified by that txid have 
been successfully synced to HDFS. 
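
The ordering described above (append under the rowlock, release the lock, then
sync by txid) can be sketched with a toy log. ToyLog and ToyCounter are
hypothetical classes written for this illustration and are not the HLog or
HRegion code in the patch; the in-memory "durable" list merely stands in for
HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy write-ahead log; a stand-in for illustration, not HBase's HLog. */
final class ToyLog {
    private final List<String> pending = new ArrayList<>(); // appended, not yet synced
    private final List<String> durable = new ArrayList<>(); // stands in for HDFS
    private long lastAppendedTxid = 0;
    private long lastSyncedTxid = 0;

    /** Buffers the edit and returns its transaction id without syncing. */
    synchronized long appendNoSync(String edit) {
        pending.add(edit);
        return ++lastAppendedTxid;
    }

    /** Returns once every edit up to and including txid is durable. */
    synchronized void sync(long txid) {
        if (txid <= lastSyncedTxid) {
            return; // an earlier sync already covered this transaction
        }
        durable.addAll(pending); // one batch covers all pending edits
        pending.clear();
        lastSyncedTxid = lastAppendedTxid;
    }

    synchronized long syncedUpTo() { return lastSyncedTxid; }

    synchronized int durableCount() { return durable.size(); }
}

/** Toy single-row counter showing the lock, append, unlock, sync ordering. */
final class ToyCounter {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final ToyLog log;
    private long value = 0;

    ToyCounter(ToyLog log) { this.log = log; }

    long increment(long delta) {
        long txid;
        long result;
        rowLock.lock();
        try {
            value += delta;
            result = value;
            txid = log.appendNoSync("increment " + delta);
        } finally {
            rowLock.unlock(); // released before the slow sync
        }
        log.sync(txid); // other writers can take the row lock meanwhile
        return result;
    }
}
```

Because a sync covers every pending edit, a caller whose txid was already
covered by another caller's sync returns immediately, which is where the
throughput gain on hot rows would come from in this sketch.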

 The increment operation can release the rowlock before sync-ing the Hlog
 

 Key: HBASE-4487
 URL: https://issues.apache.org/jira/browse/HBASE-4487
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSync4.txt


 This allows for better throughput when there are hot rows. I have seen this 
 change make a single row update improve from 400 increments/sec/server to 
 4000 increments/sec/server.





[jira] [Updated] (HBASE-4487) The increment operation can release the rowlock before sync-ing the Hlog

2011-09-29 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4487:


Attachment: appendNoSync5.txt

Addressed Ted Yu's review comments. The code that does 

{code}
  for (Entry e : pending) {
    writer.append(e);
  }
{code}

does not catch exceptions; instead, it throws an exception to the caller if any of 
the edits fail to make it to HDFS. In fact, the HBase regionserver exits if an HDFS 
write/sync fails; this is expected behaviour.
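
A small Java sketch of that propagation behaviour follows. PendingFlushSketch
and its Writer interface are hypothetical and only mirror the shape of the loop
quoted above. The loop has no try/catch, so the first failed append stops the
flush and the IOException reaches the caller.

```java
import java.io.IOException;
import java.util.List;

final class PendingFlushSketch {
    /** Hypothetical writer; append may fail the way an HDFS write can. */
    interface Writer {
        void append(String entry) throws IOException;
    }

    /** Appends each pending entry and returns how many were written. */
    static int flushPending(List<String> pending, Writer writer) throws IOException {
        int written = 0;
        for (String e : pending) {
            writer.append(e); // no catch here: a failure propagates to the caller
            written++;
        }
        return written;
    }
}
```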

 The increment operation can release the rowlock before sync-ing the Hlog
 

 Key: HBASE-4487
 URL: https://issues.apache.org/jira/browse/HBASE-4487
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSync4.txt, appendNoSync5.txt


 This allows for better throughput when there are hot rows. I have seen this 
 change make a single row update improve from 400 increments/sec/server to 
 4000 increments/sec/server.





[jira] [Updated] (HBASE-4477) Ability for an application to store metadata into the transaction log

2011-09-29 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4477:


Attachment: coprocessorPut1.txt

Implemented Andrew's suggestion of enhancing the prePut, postPut, preDelete and 
postDelete APIs to take in the Put/Delete object itself.

In the process of running tests.

 Ability for an application to store metadata into the transaction log
 -

 Key: HBASE-4477
 URL: https://issues.apache.org/jira/browse/HBASE-4477
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: coprocessorPut1.txt, hlogMetadata1.txt


 MySQL allows an application to store an arbitrary blob along with each 
 transaction in its transaction logs. This JIRA requests a similar feature 
 for HBase.
 The use case is as follows: an application in data center A stores a blob 
 of data along with each transaction. Replication software picks up these 
 blobs from the transaction logs in A and hands them to another instance of the 
 same application running in a remote data center B. The application in B is 
 responsible for applying them to the remote HBase cluster (and also for 
 handling conflict resolution, if any).





[jira] [Updated] (HBASE-4487) The increment operation can release the rowlock before sync-ing the Hlog

2011-09-29 Thread dhruba borthakur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-4487:


Attachment: appendNoSync6.txt

All unit tests pass now (except TestDistributedLogSplitting, 
TestRollingRestart, TestHTablePool), but I am seeing the same tests fail on 
trunk, so these failures do not seem to be related to this patch.

The one reference to System.err.println() is a printUsage() message that is 
needed only if you want to run the unit test as a standalone command line utility.

There is a single test TestIncrement that creates 100 threads and ensures 
that all the concurrent increments match the final expected result.

There is a benchmark TestHLogBench that measures the performance of the 
appendNoSync call.

 The increment operation can release the rowlock before sync-ing the Hlog
 

 Key: HBASE-4487
 URL: https://issues.apache.org/jira/browse/HBASE-4487
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: appendNoSync4.txt, appendNoSync5.txt, appendNoSync6.txt


 This allows for better throughput when there are hot rows. I have seen this 
 change make a single row update improve from 400 increments/sec/server to 
 4000 increments/sec/server.
