[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:
---
    Attachment: results.zip

This zip contains the latest test runs. The difference is that a separate report is created for each node.

> Cassandra Resource Provider READ Latency Stats
> ---
>
>                 Key: SLING-3026
>                 URL: https://issues.apache.org/jira/browse/SLING-3026
>             Project: Sling
>          Issue Type: Task
>            Reporter: Dishara Wijewardana
>            Priority: Critical
>         Attachments: CassandraCreateLatencyReport.txt,
> CassandraDeleteLatencyReport.txt, CassandraIntegrationTest.patch,
> CassandraLatencyReport.txt, CassandraLatencyReport_V1.txt,
> CassandraUpdateLatencyReport.txt, CUD_Latency_Report-11-09-13.zip,
> results.zip, SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt,
> SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png,
> SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png
>
>
> This is to keep track of the latency statistics for requests made on the
> Cassandra layer through the Cassandra Resource Provider. Here we use Apache
> Benchmark.
> We have a test profile Java component in the cassandra module to add bulk
> test data to Cassandra.
> /content/cassandra/A/0 to /content/cassandra/A/999
> /content/cassandra/B/0 to /content/cassandra/B/
> /content/cassandra/C/0 to /content/cassandra/C/9
> /content/cassandra/D/0 to /content/cassandra/D/99
> And then this JIRA will keep track of reports on the HTTP request time to
> retrieve 1 node from each following data collection.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:
---
    Attachment: CUD_Latency_Report-11-09-13.zip

Hi,
Here are the latest CUD results, which were run on pre-populated collections.
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:
---
    Attachment: CassandraUpdateLatencyReport.txt
                CassandraDeleteLatencyReport.txt
                CassandraCreateLatencyReport.txt

Here I am attaching the CUD Cassandra performance data sheets.
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Ian,

On Thu, Aug 29, 2013 at 5:32 PM, Ian Boston wrote:
> Thinking about it some more:
> Doing access control correctly is non-trivial, so much so we have had
> several long discussions on dev@ already. I am conscious of the time
> left on GSoC and think it would be better to use what you have learnt
> about write to implement the create and update methods on the
> CassandraResource. If that really is quick then there will be time to
> dive into implementing ACLs.
>
> I think that will eliminate one of the project steps (read ACLs) and
> replace it with read and write ACLs at the end, which means you may be
> able to achieve everything you set out to at the start of the project.
>
> Is that Ok ?

+1 and that really makes sense. Thank you for the heads up.

> Ian

--
Thanks
/Dishara
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
>> I think you are clear to go onto the next phase if you are willing.
>> I think you have an option. Do write or do access control. Write will
>> be more exciting, access control will be more thought intensive. You
>> may already have most of the write code, as you just wrote 100M items!
>>
>> Which one ?
>
> I do have code that writes to Cassandra. But if you are OK with it, I would
> like to go with read with access control first. I can easily incorporate
> writing into a Sling interface once I have worked out the API. I am not sure
> about the complexity of access control :-). So better to start with it :-)

Thinking about it some more:
Doing access control correctly is non-trivial, so much so we have had several long discussions on dev@ already. I am conscious of the time left on GSoC and think it would be better to use what you have learnt about write to implement the create and update methods on the CassandraResource. If that really is quick then there will be time to dive into implementing ACLs.

I think that will eliminate one of the project steps (read ACLs) and replace it with read and write ACLs at the end, which means you may be able to achieve everything you set out to at the start of the project.

Is that Ok ?

Ian
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Ian,

On Thu, Aug 29, 2013 at 12:13 PM, Ian Boston wrote:
> Hi Dishara,
> That was definitely worth doing. Looks like it is flat scalable on
> read in this form up to 100M items per collection. Before I get too
> excited about the results, are you absolutely certain that the
> ResourceProvider is retrieving the items requested and not just short
> circuiting somewhere ?

I hope the results are correct :-).

- Currently we can only retrieve Cassandra paths that are predefined in a Map. So before running this, I also updated the provider to populate 100 children under each of the nodes A, B, C, D, E and F. If I try to obtain a node that has not been populated, it gives me an error (I have verified this). So it picks the nodes from exactly the places it is supposed to.
- The latest result I gave was taken after some server warm-up. In the very first iteration, the FIRST RUN average was about 25-26 ms (just after adding 100M records my computer seemed to be overheated, etc.). The SECOND RUN was OK, with an average of about 12-15 ms. Then I ran it again and got the result I posted, which is around 12-14 ms.
- Also, for the nodes /content/cassandra/F/0 .. /content/cassandra/F/, each node in Cassandra evaluates to a different key. So even though we have 100M under the same column family, for Cassandra, from the Hector point of view (the relational DB point of view), it is just a set of records with 10M unique keys, and we read one record at a time (but I am not sure how Cassandra really stores this data). So I believe that even with 1B records there will not be a huge latency difference (given that we cluster Cassandra for better performance).

> I think you are clear to go onto the next phase if you are willing.
> I think you have an option. Do write or do access control. Write will
> be more exciting, access control will be more thought intensive. You
> may already have most of the write code, as you just wrote 100M items!
>
> Which one ?

I do have code that writes to Cassandra. But if you are OK with it, I would like to go with read with access control first. I can easily incorporate writing into a Sling interface once I have worked out the API. I am not sure about the complexity of access control :-). So better to start with it :-)

> Ian
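The point in this reply that each node resolves to its own key, and that an unpopulated node yields an error, can be illustrated with a small sketch. It is only an analogy: a Python dict stands in for the column family, `populate` and `read_node` are hypothetical names, and nothing here reflects how Cassandra or Hector actually store or fetch rows.

```python
# Hypothetical in-memory stand-in for a column family: one row per resource path.
def populate(store, collection, child_ids):
    """Add one row per child, keyed by its full resource path."""
    for child_id in child_ids:
        path = f"/content/cassandra/{collection}/{child_id}"
        store[path] = {"path": path}


def read_node(store, path):
    """Read exactly one row by key; sibling rows are never scanned."""
    if path not in store:
        # Mirrors the behaviour described in the message: an unpopulated node is an error.
        raise KeyError(f"no such resource: {path}")
    return store[path]


store = {}
populate(store, "A", range(100))
populate(store, "F", range(100))
node = read_node(store, "/content/cassandra/F/42")
```

In this toy model a read depends only on the key, not on how many siblings share the collection, which is the intuition behind expecting a flat latency curve. Whether the real storage layer behaves the same way is exactly what the measurements are meant to check.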
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Dishara,

That was definitely worth doing. Looks like it is flat scalable on read in this form up to 100M items per collection. Before I get too excited about the results, are you absolutely certain that the ResourceProvider is retrieving the items requested and not just short circuiting somewhere ?

I think you are clear to go onto the next phase if you are willing. I think you have an option. Do write or do access control. Write will be more exciting, access control will be more thought intensive. You may already have most of the write code, as you just wrote 100M items!

Which one ?

Ian
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Ian,

I have updated the latest results in the JIRA; please find the report named "CassandraLatencyReport_V1.txt" for the latest results. I also improved the report to give the average latency under each node. Nodes "E" and "F" hold the 10M and 100M collections.

P.S. It took more than 7 hours for me to populate a 100M collection :-). Seeing the results, it seems worth having spent that much time on it.

===
== FIRST RUN TEST SUMMERY==
[RESULT] Average Latency Under Node A(1K) = 14 (ms)
[RESULT] Average Latency Under Node B(10K) = 11 (ms)
[RESULT] Average Latency Under Node C(100K) = 12 (ms)
[RESULT] Average Latency Under Node D(1M) = 21 (ms)
[RESULT] Average Latency Under Node E(10M) = 21 (ms)
[RESULT] Average Latency Under Node F(100M) = 16 (ms)
[FIRST RUN] #TOTAL CALLS = 600 Total Average Latency = 16 (ms)
===
== SECOND RUN TEST SUMMERY==
[RESULT] Average Latency Under Node A(1K) = 10 (ms)
[RESULT] Average Latency Under Node B(10K) = 15 (ms)
[RESULT] Average Latency Under Node C(100K) = 14 (ms)
[RESULT] Average Latency Under Node D(1M) = 14 (ms)
[RESULT] Average Latency Under Node E(10M) = 11 (ms)
[RESULT] Average Latency Under Node F(100M) = 14 (ms)
[FIRST RUN] #TOTAL CALLS = 600 Total Average Latency = 13 (ms)
===

--
Thanks
/Dishara
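As a quick cross-check of the summary figures in this message, the sketch below recomputes the total average from the six per-node averages. It assumes the 600 calls were split evenly across the six nodes (100 each), so that the unweighted mean of the node averages equals the overall mean; the message does not state the split explicitly. The function name `overall_average` is hypothetical.

```python
# Per-node average latencies in ms, copied from the two summaries in the message.
first_run = {"A(1K)": 14, "B(10K)": 11, "C(100K)": 12,
             "D(1M)": 21, "E(10M)": 21, "F(100M)": 16}
second_run = {"A(1K)": 10, "B(10K)": 15, "C(100K)": 14,
              "D(1M)": 14, "E(10M)": 11, "F(100M)": 14}


def overall_average(per_node_ms):
    """Unweighted mean of per-node averages (assumes equal call counts per node)."""
    return sum(per_node_ms.values()) / len(per_node_ms)


first_total = overall_average(first_run)    # 95 / 6
second_total = overall_average(second_run)  # 78 / 6
```

Under that assumption the values round to 16 ms and 13 ms, which agrees with the reported "Total Average Latency" lines. The per-node spread (11 to 21 ms in the first run, 10 to 15 ms in the second) shows no trend with collection size.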
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:
---
    Attachment: CassandraLatencyReport_V1.txt

Here I am attaching the latest test results, which include the latency to pull data from the 10M and 100M collections.
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Dishara,

That's excellent. More evidence that read scalability is flat as the number of child entries rises. All the results are distributed around 14 ms regardless of the number of child nodes, up to 1M.

How much effort would it be to populate a collection with 10M and 100M items ?

Ian

On 27 August 2013 06:44, Dishara Wijewardana wrote:
> Hi Ian, FYI
> The exercise data is the first 100 even numbers and test data is first 100
> odd numbers.
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Ian, FYI

The exercise data is the first 100 even numbers and the test data is the first 100 odd numbers.

On Tue, Aug 27, 2013 at 11:12 AM, Dishara Wijewardana <ddwijeward...@gmail.com> wrote:
> Hi Ian,
> I have updated the JIRA https://issues.apache.org/jira/browse/SLING-3026 with
> the new test results. I have created an integration test which runs inside
> /launchpad/integration-tests and does exactly what you described.
> I am writing the results to a file, which is also attached to the
> JIRA. It also shows the test summary with the average latency.
>
> NOTE: Here I use the HTTPBase test to make the HTTP calls, and I calculate the
> latency as the time difference in milliseconds between just before and just
> after the call.

--
Thanks
/Dishara
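The two ID sets described in this message can be written down in a couple of lines. This is a minimal sketch, assuming "the first 100 even numbers" starts at 0 and "the first 100 odd numbers" starts at 1; the message does not say where the sequences begin, and the variable names are hypothetical.

```python
# Exercise set: used to bring the server into a running state before timing.
exercise_ids = [2 * i for i in range(100)]   # 0, 2, ..., 198 (assumed to start at 0)

# Test set: the children whose retrieval is actually timed.
test_ids = [2 * i + 1 for i in range(100)]   # 1, 3, ..., 199

# The two sets must not overlap, so a timed read never hits a child
# that the exercise phase has just fetched.
overlap = set(exercise_ids) & set(test_ids)
```

Choosing evens and odds guarantees that the sets are disjoint, but it is deterministic rather than randomised. All IDs also stay below 1000, so they exist in every collection, including the smallest one.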
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Ian,

I have updated the JIRA issue https://issues.apache.org/jira/browse/SLING-3026 with the new test results. I have created an integration test that runs inside /launchpad/integration-tests and does exactly what you described. The test writes its results to a file, which is also attached to the JIRA issue, and the file includes a test summary with the average latency.

NOTE: I use the HTTPBase test to make the HTTP calls, and I calculate the latency as the difference in milliseconds between the time just before the call and the time just after it.

On Sat, Aug 24, 2013 at 1:06 PM, Ian Boston wrote:
> Hi Dishara,
>
> Interesting. Read times show no correlation with the number of items in a
> collection (that's good!). From 1 to 1M child nodes the access time is almost
> identical, showing flat scalability for reads as collection size grows.
>
> Since the results are so good, I think it would be worth expanding the
> test to verify that this really is the case.
>
> Rather than starting a fresh server, can you randomise which node is
> retrieved, retrieve the node only once, and run against a server that
> has previously been exercised on different nodes?
>
> The test algorithm should go something like this:
>
> populate a set with 100 unique numbers in the range 0-1000 (call this the exercise set)
> populate a set with 100 unique numbers in the range 0-1000 not in the first set (call this the test set)
> for each collection (A, B, C, D):
>     get all the children in the exercise set.
>     record the time taken to get each child in the test set. (first time results)
>     get all the children in the exercise set.
>     record the time taken to get each child in the test set. (second time results)
>
> This may not be a perfect test, but it tries to bring the server up
> into a running state, eliminate first-time startup costs, and measure the
> time taken to get a child the first and second time. If that still shows
> a completely flat scaling curve from 0 to 1M items, then that becomes
> really interesting.
>
> Ian
>
> On 23 August 2013 03:33, Dishara Wijewardana (JIRA) wrote:
> > Dishara Wijewardana updated SLING-3026:
> >
> > Attachment: SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png
> >
> > The corresponding graphs are attached herewith.

--
Thanks
/Dishara
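The latency calculation described in the note above (the time difference in milliseconds around each HTTP call, followed by an average in the summary) can be sketched roughly as follows. This is only an illustration in Python, not the Java integration test attached to the issue; the helper names `http_get`, `latency_ms` and `summarize` are hypothetical, and the min/max fields in the summary are an assumption added for convenience.

```python
import time
import urllib.request


def http_get(url, timeout=10.0):
    """Fetch a URL, read the whole body, and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def latency_ms(call):
    """Run call() once and return the elapsed time in milliseconds."""
    before = time.monotonic()  # monotonic clock, unaffected by wall-clock changes
    call()
    after = time.monotonic()
    return (after - before) * 1000.0


def summarize(samples_ms):
    """Return count, average, minimum and maximum of the latency samples."""
    if not samples_ms:
        raise ValueError("no latency samples recorded")
    return {
        "count": len(samples_ms),
        "avg_ms": sum(samples_ms) / len(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }
```

A single sample could then be taken with something like `latency_ms(lambda: http_get(url))`, where `url` points at one of the `/content/cassandra/...` nodes on a running instance (the host and port depend on the local setup). Note that timing on the client side includes HTTP and network overhead, not only the time spent in the Cassandra layer.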
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:

Attachment: CassandraLatencyReport.txt

Please find attached the latency report with the latest model.

> Attachments: CassandraIntegrationTest.patch, CassandraLatencyReport.txt, SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:

Attachment: CassandraIntegrationTest.patch

I am attaching the patch that contains the integration test for the Cassandra load test.

> Attachments: CassandraIntegrationTest.patch, CassandraLatencyReport.txt, SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png
Re: [jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
Hi Dishara,

Interesting. Read times show no correlation with the number of items in a collection (that's good!). From 1 to 1M child nodes the access time is almost identical, showing flat scalability for reads as collection size grows.

Since the results are so good, I think it would be worth expanding the test to verify that this really is the case.

Rather than starting a fresh server, can you randomise which node is retrieved, retrieve the node only once, and run against a server that has previously been exercised on different nodes?

The test algorithm should go something like this:

populate a set with 100 unique numbers in the range 0-1000 (call this the exercise set)
populate a set with 100 unique numbers in the range 0-1000 not in the first set (call this the test set)
for each collection (A, B, C, D):
    get all the children in the exercise set.
    record the time taken to get each child in the test set. (first time results)
    get all the children in the exercise set.
    record the time taken to get each child in the test set. (second time results)

This may not be a perfect test, but it tries to bring the server up into a running state, eliminate first-time startup costs, and measure the time taken to get a child the first and second time. If that still shows a completely flat scaling curve from 0 to 1M items, then that becomes really interesting.

Ian

On 23 August 2013 03:33, Dishara Wijewardana (JIRA) wrote:
> Dishara Wijewardana updated SLING-3026:
>
> Attachment: SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png
>
> The corresponding graphs are attached herewith.
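The test algorithm proposed in the message above could be sketched along these lines. This is a rough Python illustration under stated assumptions, not the integration test that was later attached to the issue: `fetch` is a hypothetical callable that retrieves one node by path (for example an HTTP GET against a running instance), the function names are invented for this sketch, and indices are drawn from 0 to 999 so that every index exists in the smallest collection.

```python
import random
import time


def build_index_sets(size=100, upper=1000, seed=None):
    """Return two disjoint sets of `size` unique indices drawn from range(upper)."""
    rng = random.Random(seed)
    # Sampling 2 * size values without replacement guarantees uniqueness
    # and that the two halves do not overlap.
    picked = rng.sample(range(upper), 2 * size)
    return set(picked[:size]), set(picked[size:])


def timed_get(fetch, path):
    """Call fetch(path) and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    fetch(path)
    return (time.perf_counter() - start) * 1000.0


def run_latency_test(fetch, collections=("A", "B", "C", "D"), seed=None):
    """Exercise each collection, then time the test set; do this twice."""
    exercise, test = build_index_sets(seed=seed)
    results = {}
    for name in collections:
        base = f"/content/cassandra/{name}"
        passes = []
        for _ in range(2):  # pass 0: first time results, pass 1: second time results
            for i in sorted(exercise):
                fetch(f"{base}/{i}")  # exercise the server, timings discarded
            passes.append({i: timed_get(fetch, f"{base}/{i}") for i in sorted(test)})
        results[name] = {"first": passes[0], "second": passes[1]}
    return results
```

Keeping `fetch` as a parameter means the same loop can be driven by a real HTTP client or by a stub when checking the bookkeeping. Comparing the "first" and "second" timings per collection is what would show whether the flat curve survives once caching effects are separated out.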
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:

Attachment: SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png

The corresponding graphs are attached herewith.

> Attachments: SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt, SLING_CASSANDRA_LATENCY_STATS_CHART_22-08-2013.png, SLING_CASSANDRA_LATENCY_STATS_TWO_CHART_22-08-2013.png
[jira] [Updated] (SLING-3026) Cassandra Resource Provider READ Latency Stats
[ https://issues.apache.org/jira/browse/SLING-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dishara Wijewardana updated SLING-3026:

Attachment: SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt

I have attached the first round of the report.

> Attachments: SLING_CASSANDRA_LATENCY_STATS_22-08-2013.txt