[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097024#comment-16097024 ]

Enis Soztutar commented on HBASE-18086:
---

bq. The tool does puts, appends, scans and gets in separate rounds.

Thanks. The logic is much easier to follow now.

bq. With shifted region (along with unshifted region), more than one region is involved for the multi-get requests.

But it is still 2 regions at a time. Can we please do what I was suggesting above? You can do something like this (assuming 10 threads):

thread 0 reads: row_0, row_10, row_20, row_30
thread 1 reads: row_1, row_11, row_21, ...

Other than this, looks good.

> Create native client which creates load on selected cluster
> ---
>
> Key: HBASE-18086
> URL: https://issues.apache.org/jira/browse/HBASE-18086
> Project: HBase
> Issue Type: Sub-task
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 18086.v11.txt, 18086.v12.txt, 18086.v14.txt,
> 18086.v17.txt, 18086.v18.txt, 18086.v1.txt, 18086.v3.txt, 18086.v4.txt,
> 18086.v5.txt, 18086.v6.txt, 18086.v7.txt, 18086.v8.txt
>
>
> This task is to create a client which uses multiple threads to conduct Puts
> followed by Gets against selected cluster.
> Default is to run the tool against local cluster.
> This would give us some idea on the characteristics of native client in terms
> of handling high load.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
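
The striping in the comment above (thread 0 reads row_0, row_10, row_20, ...) can be sketched roughly as follows. This is only an illustration: `RowsForThread` is a hypothetical helper name, and the `row_` key format is taken from the example in the comment, not from the patch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): returns the row keys one reader
// thread would request when rows are striped across threads by index modulo
// the thread count, instead of by contiguous range.
std::vector<std::string> RowsForThread(uint64_t thread_id, uint64_t num_threads,
                                       uint64_t total_rows) {
  std::vector<std::string> rows;
  if (num_threads == 0) {
    return rows;  // nothing to assign without any threads
  }
  // Thread t takes rows t, t + num_threads, t + 2 * num_threads, ...
  for (uint64_t i = thread_id; i < total_rows; i += num_threads) {
    rows.push_back("row_" + std::to_string(i));
  }
  return rows;
}
```

With 10 threads and 40 rows, thread 0 would get row_0, row_10, row_20 and row_30, so a single multi-get built from one thread's rows touches keys spread over the whole key space rather than one contiguous range.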
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095266#comment-16095266 ]

Ted Yu commented on HBASE-18086:

bq. Why are we doing Deletes before Append / Increment?

Since Append / Increment is not idempotent, the delete calls were intended to make successive runs quicker. Otherwise, the truncate_preserve command is involved, which takes some time.
I can drop this in the next patch.

The table is created with the following clause:
{code}
SPLITS => ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
{code}
With shifted region (along with unshifted region), more than one region is involved for the multi-get requests.

w.r.t. simplifying verification logic, since an integer written through Increment has a unique format (e.g. \x00\x00\x00\x00\x00\x00\x00\x01), I want to see if there is a suggestion on how to detect that the value of a Cell should be interpreted as an integer.
We shouldn't rely on the length of the value, since a string can be 8 bytes long as well.
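
To illustrate why the length alone is not enough, here is a small sketch of the 8-byte big-endian layout shown in the comment above. `EncodeLong` and `DecodeLong` are hypothetical names used only for this example; they are not taken from the patch or from the native client.

```cpp
#include <cstdint>
#include <string>

// Encodes a 64-bit value as 8 big-endian bytes, i.e. the layout quoted in
// the comment (\x00\x00\x00\x00\x00\x00\x00\x01 for the value 1).
std::string EncodeLong(int64_t value) {
  std::string out(8, '\0');
  uint64_t v = static_cast<uint64_t>(value);
  for (int i = 7; i >= 0; --i) {
    out[i] = static_cast<char>(v & 0xff);  // lowest byte goes last
    v >>= 8;
  }
  return out;
}

// Decodes big-endian bytes back into a 64-bit value. The caller has to know
// already that the bytes hold a counter; nothing in the bytes says so.
int64_t DecodeLong(const std::string &bytes) {
  uint64_t v = 0;
  for (unsigned char c : bytes) {
    v = (v << 8) | c;
  }
  return static_cast<int64_t>(v);
}
```

An ordinary 8-character string such as "12345678" decodes without error too, which is the ambiguity raised above: the decision to treat a value as an integer has to come from somewhere other than the bytes themselves.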
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095149#comment-16095149 ]

Enis Soztutar commented on HBASE-18086:
---

- This method is not used, did you mean to use it?:
{code}
+std::string Row(const std::string &prefix, uint64_t i) {
{code}
- We should accumulate the cells for every family, and do 1 put per row, instead of 1 put per-row per-family:
{code}
+for (auto family : families) {
+  auto n_cols = Random::rand32(1, cols);
+  auto put = Put{row};
+  put.AddColumn(family, kNumColumn, std::to_string(n_cols));
+  for (unsigned int k = 1; k <= n_cols; k++) {
+    put.AddColumn(family, std::to_string(k), row);
+  }
+  table->Put(put);
+}
{code}
- You should not call the methods with passing {{FLAGS_num_rows - 1}}, because total_rows is 1, rather than , etc. Instead total_rows should be sent as is, but if you want to get the width of the max element, then you can send total_rows to the PrefixZero function, and inside the function you can do this logic of {{total_rows - 1}}.
- Why are we doing Deletes before Append / Increment? Are appends and increments going to rows that always have previous Puts? What happens when a Delete and an Append come with the same timestamp?
- The problem with doing scan and get verification differently is that we are complicating this logic unnecessarily. Can we please make it so that we do a round of multi-gets and a round of Scans? The test should be simple to understand and simple to debug. Having half of the threads doing one thing, and the other half doing the other just complicates the logic for no good reason. Please make it so that both Scans and Gets can verify the same set of rows, and use the same {{VerifyResult()}}-like function to verify the data.
- Same thing with the Puts / Increments. We would want to have the Puts and Increment / Appends as separate (optional) steps, as it is in the LoadTestTool.

bq. w.r.t. letting multi-get requests go to different regions, in patch v16 iteration i would issue gets for iteration i+2 (scans and gets are interleaved).

I am not sure whether this helps. The rows for the multi-gets still go to the same region, but the regions are shifted now. Am I missing something?
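
The {{total_rows - 1}} point in the review above could look roughly like the sketch below: callers pass total_rows unchanged and the width of the largest row index is derived inside the function. The signature is an assumption made for illustration (the patch's PrefixZero takes a width), and `DigitCount` is a hypothetical helper.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: number of decimal digits needed to print n.
int DigitCount(uint64_t n) {
  int digits = 1;
  while (n >= 10) {
    n /= 10;
    ++digits;
  }
  return digits;
}

// Assumed signature, not the one in the patch: callers pass total_rows as is.
// The largest row index is total_rows - 1, and its width is computed here so
// that no call site has to subtract 1 itself.
std::string PrefixZero(uint64_t total_rows, uint64_t num) {
  const uint64_t max_index = total_rows == 0 ? 0 : total_rows - 1;
  std::ostringstream s;
  s << std::setfill('0') << std::setw(DigitCount(max_index)) << num;
  return s.str();
}
```

For 1000 rows the largest index is 999, so index 7 is rendered as 007; for 1001 rows the largest index is 1000, so the same index becomes 0007.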
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095141#comment-16095141 ]

Devaraj Das commented on HBASE-18086:
---

bq. Since load-client is supposed to verify large amount of data, I think it makes more sense to adopt this approach instead of issuing scan and get in two rounds of verification.

The idea is to put "stress" on the various code paths. Let's get the behavior where one can say what to use via an option: scan or get or multi-get. By default it can do scan or multi-get...
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093601#comment-16093601 ]

Ted Yu commented on HBASE-18086:

Tested patch v17 with v9 from HBASE-18061.

With earlier patch from HBASE-18061, the cross region multi-get hung.
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093423#comment-16093423 ]

Ted Yu commented on HBASE-18086:

Currently load-client performs both scan and get for the verification phase.
Since load-client is supposed to verify a large amount of data, I think it makes more sense to adopt this approach instead of issuing scan and get in two rounds of verification.

I chose to keep the naming for these two options:
{code}
+DEFINE_bool(skip_get, false, "skip get / scan");
+DEFINE_bool(skip_put, false, "skip put's");
{code}
In simple-client, the counterparts are named gets / puts. However, I think skip_get is more intuitive: the flag is used to skip some action.
The default values for these options convey the same result.

w.r.t. PrefixZero(), I don't see an impact on performance for the current version. So I kept it.

DoGet() may use a reversed string for verification conditionally. That is why I prefer separate methods for scan and get.

w.r.t. letting multi-get requests go to different regions, in patch v16 iteration i would issue gets for iteration i+2 (scans and gets are interleaved).
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092345#comment-16092345 ]

Ted Yu commented on HBASE-18086:

w.r.t. random number generation, after performing more runs, I noticed that the important factor for the total duration of the write phase is the total number of columns written.

I am pulling the random number generation inside the per-row loop in the next patch.
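
One way to separate generator cost from column count is to time the generator on its own while summing the column counts it produces. The sketch below does that under stated assumptions: it uses `std::mt19937` from the standard library as a stand-in for the folly generator used in the patch, the bounds are inclusive here, and `DrawColumnCounts` and `RandomRunStats` are hypothetical names.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Hypothetical result type for this sketch.
struct RandomRunStats {
  uint64_t total_columns;  // sum of the per-row column counts drawn
  double elapsed_ms;       // wall time spent in the generation loop
};

// Draws one column count in [1, max_cols] per row and reports both the total
// number of columns and the time the loop took. max_cols must be >= 1.
// std::mt19937 is a stand-in; the patch itself uses folly's Random.
RandomRunStats DrawColumnCounts(uint64_t rows, uint32_t max_cols, uint32_t seed) {
  std::mt19937 gen(seed);
  std::uniform_int_distribution<uint32_t> dist(1, max_cols);
  const auto start = std::chrono::steady_clock::now();
  uint64_t total = 0;
  for (uint64_t i = 0; i < rows; ++i) {
    total += dist(gen);
  }
  const auto end = std::chrono::steady_clock::now();
  const double ms = std::chrono::duration<double, std::milli>(end - start).count();
  return {total, ms};
}
```

Comparing `total_columns` between two runs shows whether they wrote a comparable amount of data, and `elapsed_ms` for about 1M draws shows how much of the write phase the generator itself could account for.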
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092265#comment-16092265 ]

Enis Soztutar commented on HBASE-18086:
---

bq. Updated patch v12 where random number generation is lifted outside the loop (it was observed that write performance suffered with random number generation inside the loop).

It does not make sense to me that random number generation is costly. I've looked at the folly code, and there is nothing explaining it. Can you please verify the total number of columns written in each case? You can also test with just generating 1M or so random numbers in a loop and measure the total time it takes end to end. We want each row to come with a different number of columns.

- No use of {{new}} or {{delete}}. Always use smart pointers.
{code}
+std::thread *writer_threads = new std::thread[FLAGS_threads];
{code}
- These flags should have the same names as the ones in simple-client.cc:
{code}
+DEFINE_int32(multi_get_size, 1, "number of gets in one multi-get");
+DEFINE_bool(skip_get, false, "skip get / scan");
+DEFINE_bool(skip_put, false, "skip put's");
{code}
There are also report_num_rows, scans, multigets and conf flags that you should implement.
- These should be return values instead of passing a pointer to the methods:
{code}
bool *succeeded
{code}
- Instead of executing every Cell as a different Put via Table::Put(), you should construct one Put object, add all the Cells, then call Table::Put():
{code}
 for (uint64_t j = 0; j < rows; j++) {
+  std::string row = PrefixZero(width, iteration * rows + j);
+  for (auto family : families) {
+    table->Put(Put{row}.AddColumn(family, kNumColumn, std::to_string(n_cols)));
+    for (unsigned int k = 1; k <= n_cols; k++) {
+      table->Put(Put{row}.AddColumn(family, std::to_string(k), row));
+    }
+  }
{code}
- Instead of this method:
{code}
+std::string PrefixZero(int total_width, int num) {
{code}
you can probably do something like this (from scanner-test.cc):
{code}
std::string Row(uint32_t i, int width) {
  std::ostringstream s;
  s.fill('0');
  s.width(width);
  s << i;
  return "row" + s.str();
}
{code}
- Scans and gets should validate the obtained Result using the same logic, no? I think you should extract that into a function and use it from both.
- The way we do multi-gets will result in all of the multi-get requests going to the same region. Instead, I think it is better to have the multi-gets scattered around most of the regions, so that we have a high likelihood of testing server failure handling, etc. when chaos monkey is run with this. I had argued the same in my above comments. I think we can do something like a hash-like striping across the row key space among threads, rather than range-based striping. That should give us the ability to do multi-gets across all the regions in one {{Table::Get(std::vector)}} call.
- We don't have multi-put functionality right now, but when that is added, we should do a follow-up patch for this to add multi-put functionality.
- These should default to {{load_test_table}} and {{f}} respectively.
{code}
+DEFINE_string(table, "t", "What table to do the reads and writes with");
+DEFINE_string(families, "d", "comma separated list of column family names");
{code}
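
Two of the review points above (no raw {{new}} / {{delete}} for the thread array, and return values instead of a {{bool *succeeded}} out-parameter) can be sketched together with standard-library facilities only. This is one possible shape, not the code from any patch; `RunWorkers` is a hypothetical name, and `std::async` is used here in place of a hand-managed `std::thread` array.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper: runs work(thread_index) on num_threads asynchronous
// workers and returns true only if every worker returned true. The futures
// live in a std::vector, so there is no manually allocated thread array, and
// each worker reports success through its return value.
bool RunWorkers(std::size_t num_threads,
                const std::function<bool(std::size_t)> &work) {
  std::vector<std::future<bool>> results;
  results.reserve(num_threads);
  for (std::size_t i = 0; i < num_threads; ++i) {
    results.push_back(std::async(std::launch::async, work, i));
  }
  bool all_ok = true;
  for (auto &result : results) {
    // get() waits for the worker and rethrows any exception it raised.
    all_ok = result.get() && all_ok;
  }
  return all_ok;
}
```

A caller could pass a lambda that performs one thread's share of the writes and returns whether they all succeeded; the combined result then decides the exit status of the phase.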
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091726#comment-16091726 ]

Ted Yu commented on HBASE-18086:

Running load-client against 1.1 cluster, the client hung at the end of write phase:
{code}
I0718 15:20:27.652695 9636 load-client.cc:260] joining thread 7
I0718 15:20:27.652781 9636 load-client.cc:262] joined thread 7
I0718 15:20:27.652876 9636 load-client.cc:260] joining thread 8
2017-07-18 15:20:30,545:9636(0x7f9ee50ad700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 13ms
2017-07-18 15:20:43,893:9636(0x7f9ee50ad700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 13ms
{code}
Attempt to attach gdb to the hanging process encountered:
{code}
Attaching to process 9636
Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
{code}
Even after modifying /etc/sysctl.d/10-ptrace.conf , I still got the same error.
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090802#comment-16090802 ]

Ted Yu commented on HBASE-18086:

When using load-client against an hbase 1.1 cluster, sometimes I got:
{code}
2017-07-17 23:38:42,333:9228(0x7fd65a5ea700):ZOO_INFO@check_events@1728: initiated connection to server [172.26.109.227:2181]
2017-07-17 23:38:42,351:9228(0x7fd65a5ea700):ZOO_INFO@check_events@1775: session establishment complete on server [172.26.109.227:2181], sessionId=0x15d3dfaa94a07d1, negotiated timeout=4
load-client: io/async/AsyncSocket.cpp:837: virtual void folly::AsyncSocket::closeNow(): Assertion `eventBase_ == nullptr || eventBase_->isInEventBaseThread()' failed.
*** Aborted at 1500334722 (unix time) try "date -d @1500334722" if you are using GNU date ***
PC: @ 0x7fd66bd68418 gsignal
*** SIGABRT (@0x240c) received by PID 9228 (TID 0x7fd6689a9700) from PID 9228; stack trace: ***
@ 0x7fd66ce933d0 (unknown)
@ 0x7fd66bd68418 gsignal
@ 0x7fd66bd6a01a abort
@ 0x7fd66bd60bd7 (unknown)
@ 0x7fd66bd60c82 __assert_fail
@ 0x60d9e4 folly::AsyncSocket::closeNow()
@ 0x60abf5 folly::AsyncSocket::destroy()
@ 0x50f6a2 std::_Sp_counted_deleter<>::_M_dispose()
@ 0x517c91 wangle::AsyncSocketHandler::~AsyncSocketHandler()
@ 0x51f090 wangle::ContextImpl<>::~ContextImpl()
@ 0x51e505 wangle::PipelineBase::~PipelineBase()
@ 0x51e001 wangle::Pipeline<>::~Pipeline()
@ 0x50ec1e wangle::ClientBootstrap<>::~ClientBootstrap()
@ 0x5072be hbase::ConnectionPool::GetNewConnection()
@ 0x50628c hbase::ConnectionPool::GetConnection()
@ 0x50c0b7 hbase::RpcClient::GetConnection()
@ 0x50c25e hbase::RpcClient::AsyncCall()
@ 0x47e23e hbase::RawAsyncTable::Call<>()
@ 0x46bf0b std::_Function_handler<>::_M_invoke()
@ 0x4ceddc hbase::AsyncSingleRequestRpcRetryingCaller<>::Call()
@ 0x4dfc60
_ZZZN5folly6FutureISt10shared_ptrIN5hbase14RegionLocationEEE18thenImplementationIZNS2_35AsyncSingleRequestRpcRetryingCallerINS_4UnitEE14LocateThenCallEvEUlS4_E_NS_6detail14callableResultIS4_SA_EELb0EJOS4_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSG_6ReturnEE4typeEOT_NSB_9argResultIXT1_ESK_JDpT2_NUlONS_3TryIS4_EEE_clESS_ENKUlvE_clEv @ 0x4dfa2f _ZN5folly11makeTryWithIZZNS_6FutureISt10shared_ptrIN5hbase14RegionLocationEEE18thenImplementationIZNS3_35AsyncSingleRequestRpcRetryingCallerINS_4UnitEE14LocateThenCallEvEUlS5_E_NS_6detail14callableResultIS5_SB_EELb0EJOS5_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSH_6ReturnEE4typeEOT_NSC_9argResultIXT1_ESL_JDpT2_NUlONS_3TryIS5_EEE_clEST_EUlvE_EENSG_IXsr3std7is_sameINSt9result_ofIFSL_vEE4typeEvEE5valueENSR_IvEEE4typeESM_ @ 0x4df906 _ZN5folly7PromiseINS_4UnitEE7setWithIZZNS_6FutureISt10shared_ptrIN5hbase14RegionLocationEEE18thenImplementationIZNS6_35AsyncSingleRequestRpcRetryingCallerIS1_E14LocateThenCallEvEUlS8_E_NS_6detail14callableResultIS8_SD_EELb0EJOS8_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSJ_6ReturnEE4typeEOT_NSE_9argResultIXT1_ESN_JDpT2_NUlONS_3TryIS8_EEE_clESV_EUlvE_EEvSO_ @ 0x4df67d _ZN5folly6detail8function18FunctionTypeTraitsIFvONS_3TryISt10shared_ptrIN5hbase14RegionLocationEE13ExecutorMixin13invokeFunctorINS1_9ExecutorsISA_E15FunctorExecutorIZNS_6FutureIS7_E18thenImplementationIZNS5_35AsyncSingleRequestRpcRetryingCallerINS_4UnitEE14LocateThenCallEvEUlS7_E_NS0_14callableResultIS7_SN_EELb0EJOS7_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSS_6ReturnEE4typeEOT_NS0_9argResultIXT1_ESW_JDpT2_UlS9_E_NS1_25SelectNonConstFunctionTagEvPNSF_10ExecutorIfES9_ @ 0x434e8c folly::detail::Core<>::doCallback() @ 0x45c325 folly::detail::Core<>::setResult() @ 0x451c0b 
_ZN5folly6detail8function6invokeIRZNS_6FutureISt10shared_ptrIN5hbase14RegionLocationEEE18thenImplementationIZNS5_13LocationCache14LocateFromMetaERKNS5_2pb9TableNameERKSsE3$_7NS0_14callableResultIS7_SH_EELb0EJOS7_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSM_6ReturnEE4typeEOT_NS0_9argResultIXT1_ESQ_JDpT2_UlONS_3TryIS7_EEE_JSX_EEEDTclclsr3stdE7forwardISQ_Efp_Espclsr3stdE7forwardIT0_Efp0_EEESR_DpOS11_ @ 0x434e8c folly::detail::Core<>::doCallback() @ 0x45c325 folly::detail::Core<>::setResult() @ 0x4509ea _ZN5folly6detail8function6invokeIRZNS_6FutureISt10shared_ptrIN5hbase14RegionLocationEEE18thenImplementationIZNS5_13LocationCache14LocateFromMetaERKNS5_2pb9TableNameERKSsE3$_6NS0_14callableResultIS7_SH_EELb0EJOS7_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENSM_6ReturnEE4typeEOT_NS0_9argResultIXT1_ESQ_JDpT2_UlONS_3TryIS7_EEE_JSX_EEEDTclclsr3stdE7forwardISQ_Efp_Espclsr3stdE7forwa
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090682#comment-16090682 ]

Ted Yu commented on HBASE-18086:

With random number generation inside the loop:
{code}
I0714 19:25:31.714277 1042 load-client.cc:257] Successfully sent 1000 Put requests in 152394 ms.
{code}
outside the loop:
{code}
I0717 17:13:49.566558 2131 load-client.cc:257] Successfully sent 1000 Put requests in 38833 ms.
{code}
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090120#comment-16090120 ]

Ted Yu commented on HBASE-18086:

Updated patch v12 where random number generation is lifted outside the loop (it was observed that write performance suffered with random number generation inside the loop).
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056584#comment-16056584 ]

Enis Soztutar commented on HBASE-18086:
---

bq. If we don't handle RetriesExhaustedException, how do we perform validation for what is submitted for write(s) ?

If the client gets REE, then it should fail the test. We are testing client-level retries with the test. If for whatever reason we are getting REE, it is a failure condition.
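
The "REE means the test fails" rule above can be pictured with the sketch below. It is only an illustration: `RetriesExhausted` is a local stand-in for hbase::RetriesExhaustedException, and `RunPhase` is a hypothetical wrapper; the worker code itself does not catch or skip anything.

```cpp
#include <functional>
#include <stdexcept>

// Local stand-in for hbase::RetriesExhaustedException, defined here only so
// the sketch is self-contained.
struct RetriesExhausted : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical wrapper: runs one phase of the test and maps the outcome to a
// process exit code. The phase never swallows the exception; it propagates
// to this single place, where it is reported as a failure.
int RunPhase(const std::function<void()> &phase) {
  try {
    phase();
  } catch (const RetriesExhausted &) {
    return 1;  // retries exhausted: the whole test fails
  }
  return 0;  // phase completed without exhausting retries
}
```

With this shape there is no per-Cell bookkeeping of failed writes: either every write in the phase was accepted and can be validated afterwards, or the run is reported as failed.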
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056501#comment-16056501 ]

Ted Yu commented on HBASE-18086:

bq. make sure that what we have written can be read exactly as it is.
bq. Do not handle hbase::RetriesExhaustedException's

If we don't handle RetriesExhaustedException, how do we perform validation for what is submitted for write(s)?
One option is to keep a vector of the Cells which encounter RetriesExhaustedException and skip reading them back.
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056448#comment-16056448 ]

Enis Soztutar commented on HBASE-18086:
---

One more:
- Being able to do just puts, or just gets, etc.
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056445#comment-16056445 ]

Enis Soztutar commented on HBASE-18086:
---

These are some of the features that we want from this tool. We can start with something in this jira, and do more follow-up jiras (for example for scans, etc.):
- Being able to read / write multiple column families, and multiple columns. We can use the same exact format that LTT uses in the column values (it generates a random number of columns per family).
- Being able to test Deletes, Increments and Appends.
- Being able to test multi-gets. Probably we want the multi-gets to be scattered across the regions when doing the requests.
- Being able to test scans (scans are not there in LTT, but we need better end-to-end testing in this area).
- Validation of the results that have been written. The goal is not to test the server side, but the client side. But we still need validation to make sure that what we have written can be read exactly as it is.
- Do not handle hbase::RetriesExhaustedException's.
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032047#comment-16032047 ]

Ted Yu commented on HBASE-18086:

You can specify value for parameter(s) with the following syntax (prefixing with --):

--num_rows=123
[jira] [Commented] (HBASE-18086) Create native client which creates load on selected cluster
[ https://issues.apache.org/jira/browse/HBASE-18086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018039#comment-16018039 ]

Ted Yu commented on HBASE-18086:

Patch v1 allows the specification of the following parameters:

table name
row prefix
zookeeper quorum
number of rows
number of columns
number of threads