[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2015-07-09 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated ACCUMULO-2889:
-
Resolution: Incomplete
Status: Resolved  (was: Patch Available)

This needs some more love. Eric's recent changes to how WALs are recorded in 
the metadata table may invalidate the approach this patch was taking. 
Feel free to reopen if desired.

> Batch metadata table updates for new walogs
> ---
>
> Key: ACCUMULO-2889
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2889
> Project: Accumulo
> Issue Type: Improvement
> Affects Versions: 1.5.1, 1.6.0
> Reporter: Jonathan Park
> Assignee: Jonathan Park
> Attachments: ACCUMULO-2889.0.patch.txt, ACCUMULO-2889.1.patch, 
> ACCUMULO-2889.2.patch, accumulo-2889-withpatch.png, 
> accumulo-2889_withoutpatch.png, batch_perf_test.sh, run_all.sh, 
> start-ingest.sh
>
>
> Currently, when we update the Metadata table with new loggers, we will update 
> the metadata for each tablet serially. We could optimize this to instead use 
> a batchwriter to send all metadata updates for all tablets in a batch.
> A few special cases include:
> - What if the !METADATA tablet was included in the batch?
> - What about the root tablet?
> Benefit:
> In one of our clusters, we're experiencing particularly slow HDFS operations 
> leading to large oscillations in ingest performance. We haven't isolated the 
> cause in HDFS, but when we profiled the tservers, we noticed that they were 
> waiting for metadata table operations to complete. Batching would target 
> that waiting.
> Potential downsides:
> Given the existing locking scheme, it looks like we may have to lock a tablet 
> for slightly longer (we'll lock for the duration of the batch).
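
The batching idea in the quoted description can be illustrated with a small, 
self-contained sketch. This is not Accumulo code and not the attached patch: 
FakeMetadataTable, update_serially and update_batched are hypothetical names 
invented here, and the model assumes each write call costs one round trip. 
It ignores locking and the !METADATA and root tablet cases raised above.

```python
class FakeMetadataTable:
    """Hypothetical stand-in for the metadata table; it only counts round trips."""

    def __init__(self):
        self.rows = {}
        self.round_trips = 0

    def write(self, mutations):
        """Apply a list of (tablet, walog) entries in a single round trip."""
        self.round_trips += 1
        for tablet, walog in mutations:
            self.rows.setdefault(tablet, []).append(walog)


def update_serially(table, tablets, walog):
    """One write per tablet, as the description says happens today."""
    for tablet in tablets:
        table.write([(tablet, walog)])


def update_batched(table, tablets, walog):
    """All tablet updates for the new walog sent together in one write."""
    table.write([(tablet, walog) for tablet in tablets])


if __name__ == "__main__":
    tablets = ["t1", "t2", "t3"]
    serial, batched = FakeMetadataTable(), FakeMetadataTable()
    update_serially(serial, tablets, "walog-A")
    update_batched(batched, tablets, "walog-A")
    print("serial round trips:", serial.round_trips)
    print("batched round trips:", batched.round_trips)
```

In this toy model both paths leave the same entries, and the number of 
round trips drops from one per tablet to one per batch.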



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2014-09-01 Thread Jonathan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Park updated ACCUMULO-2889:

Attachment: ACCUMULO-2889.2.patch

I'll gather a new set of #s when I get access to a cluster of machines. 



[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2014-06-28 Thread Jonathan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Park updated ACCUMULO-2889:


Attachment: accumulo-2889_withoutpatch.png
accumulo-2889-withpatch.png
ACCUMULO-2889.1.patch
start-ingest.sh
batch_perf_test.sh
run_all.sh

Results from performance tests:

Test design:
- Run continuous ingest with 4 ingesters, each ingesting 25 million entries, 
and measure the time until completion.
- We varied the # of minor compactors and tablets per server (in retrospect, 
the # of minor compactors didn't really matter in these tests; it may have 
been better to vary the # of clients).
- Each trial was run 3x and the average was taken.

Tests were run on a single node (24 logical cores, 64 GB RAM, 8 drives)

||minc||tablets/server||w/o patch (ms)||w/ patch (ms)||ratio (w/ patch / w/o patch)||
|4|32|269790.33|257537.33|0.95458325|
|12|32|271124.33|255952|0.94403922|
|12|320|355962.67|323737|0.90946896|
|24|32|268709|261362.67|0.97266065|
|24|320|355182.33|324308.67|0.91307659|

I'll try to run this on a multi-node cluster if I can get around to it.
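
The ratio column above is the patched time divided by the unpatched time. 
A minimal sketch that recomputes it from the reported timings, along with 
the percentage of run time saved, follows; the names RESULTS, ratio, 
percent_saved and summarize are illustrative only.

```python
# Timings copied from the table above:
# (minc, tablets/server, without patch ms, with patch ms)
RESULTS = [
    (4, 32, 269790.33, 257537.33),
    (12, 32, 271124.33, 255952.0),
    (12, 320, 355962.67, 323737.0),
    (24, 32, 268709.0, 261362.67),
    (24, 320, 355182.33, 324308.67),
]


def ratio(without_ms, with_ms):
    """Patched time as a fraction of unpatched time (lower is better)."""
    return with_ms / without_ms


def percent_saved(without_ms, with_ms):
    """Percentage of the unpatched run time saved by the patch."""
    return (1.0 - ratio(without_ms, with_ms)) * 100.0


def summarize(results):
    """Return (minc, tablets, ratio, percent saved) for each table row."""
    return [
        (minc, tablets, ratio(a, b), percent_saved(a, b))
        for minc, tablets, a, b in results
    ]


if __name__ == "__main__":
    for minc, tablets, r, saved in summarize(RESULTS):
        print(f"minc={minc:2d} tablets={tablets:3d} ratio={r:.4f} saved={saved:.1f}%")
```

On these numbers, the 320 tablets/server rows save roughly 9% of run time 
and the 32 tablets/server rows roughly 3% to 6%.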

> Batch metadata table updates for new walogs
> ---
>
> Key: ACCUMULO-2889
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2889
> Project: Accumulo
> Issue Type: Improvement
> Affects Versions: 1.5.1, 1.6.0
> Reporter: Jonathan Park
> Assignee: Jonathan Park
> Attachments: ACCUMULO-2889.0.patch.txt, ACCUMULO-2889.1.patch, 
> accumulo-2889-withpatch.png, accumulo-2889_withoutpatch.png, 
> batch_perf_test.sh, run_all.sh, start-ingest.sh



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2014-06-16 Thread Eric Newton (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Newton updated ACCUMULO-2889:
--

Issue Type: Improvement  (was: Bug)

> Batch metadata table updates for new walogs
> ---
>
> Key: ACCUMULO-2889
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2889
> Project: Accumulo
> Issue Type: Improvement
> Affects Versions: 1.5.1, 1.6.0
> Reporter: Jonathan Park
> Assignee: Jonathan Park
> Attachments: ACCUMULO-2889.0.patch.txt


[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2014-06-16 Thread Jonathan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Park updated ACCUMULO-2889:


Attachment: ACCUMULO-2889.0.patch.txt

> Batch metadata table updates for new walogs
> ---
>
> Key: ACCUMULO-2889
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2889
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.5.1, 1.6.0
> Reporter: Jonathan Park
> Assignee: Jonathan Park
> Attachments: ACCUMULO-2889.0.patch.txt


[jira] [Updated] (ACCUMULO-2889) Batch metadata table updates for new walogs

2014-06-16 Thread Jonathan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Park updated ACCUMULO-2889:


Affects Version/s: 1.6.0
   Status: Patch Available  (was: In Progress)

First pass at batching metadata updates for new WALs. I'll attach a screenshot 
of its effects as well.

> Batch metadata table updates for new walogs
> ---
>
> Key: ACCUMULO-2889
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2889
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.6.0, 1.5.1
> Reporter: Jonathan Park
> Assignee: Jonathan Park
> Attachments: ACCUMULO-2889.0.patch.txt