Hi John,

Which fix (committed on Jan 11) were you referring to? Is it in OpenTSDB or
HBase? Do you have a JIRA number?

Thanks

On Wed, Feb 25, 2015 at 5:49 AM, brady2 [via Apache HBase] <
ml-node+s679495n4068627...@n3.nabble.com> wrote:

> Hi Sathya and Nick,
>
> Here are the stack traces of the region server dumps when the huge .tmp
> files are created:
>
> https://drive.google.com/open?id=0B1tQg4D17jKQNDdFZkFQTlg4ZjQ&authuser=0
>
> As background, we are not using compression. Compaction occurs every
> hour. Everything else is at the defaults.
>
> OpenTSDB v2.0 is running on top of Cloudera 5.3.1 in AWS. We have a 7-node
> Cloudera cluster (each node with 32 GB of RAM and 3 TB of disk space), with 5
> OpenTSDB instances dedicated to writing and 2 to reading. We are using
> AWS ELBs in front of OpenTSDB to balance the reads/writes.
>
> We are load testing OpenTSDB over sockets, but we are running into several
> issues. Let me first explain how we do this load testing:
>
> 1. From another AWS system, we have written a testing framework to generate
> the load.
>
> 2. The framework takes several parameters: the number of threads, the loop
> size (i.e. the number of sockets each thread will open), and the batch size
> (i.e. the number of puts, or inserts, each socket connection will handle).
> A rough sketch of this kind of generator is included after this list.
>
> 3. To simplify troubleshooting, we removed variables from the tests: we
> have just 1 OpenTSDB instance behind the AWS ELB, so the load is sent to a
> single instance only.
>
> 4. We are initially creating the OpenTSDB tables without any pre-splitting
> of regions.
>
> 5. We are doing the loading with 1 metric only for ease of querying in the
> UI.
>
> 6. We are sending under 5,000 inserts per second.
>
> 7. At the top of the hour, the row compaction kicks in, the region server
> gets too busy, and we lose data. It recovers the first time, but by the
> second hour there is presumably so much data that it doesn't recover. To fix
> it, we have to restart Cloudera, reboot the nodes, and drop and re-create the
> tsdb tables. Otherwise the .tmp file keeps growing until it fills the 3 TB
> disks and the system becomes unresponsive. (See the note on compaction and
> pre-splitting after this list.)
>
> 8. We see problems with region splits happening under heavy load. We noted
> a code fix committed on Jan 11 for this, but I presume that is not in RC2.1.
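>
> For reference, here is a rough sketch of the kind of socket-based load
> generator described in point 2. It is a minimal illustration only; the
> host, port, metric name, tag, and thread/socket/batch counts are made-up
> values, and the real framework takes these as parameters:
>
>     // Opens SOCKETS_PER_THREAD connections per thread and writes
>     // PUTS_PER_SOCKET telnet-style "put" lines on each connection.
>     import java.io.PrintWriter;
>     import java.net.Socket;
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>
>     public class TsdbLoadTest {
>         static final String HOST = "tsd-elb.example.com"; // hypothetical ELB endpoint
>         static final int PORT = 4242;                      // default TSD listen port
>         static final int THREADS = 4;                      // number of worker threads
>         static final int SOCKETS_PER_THREAD = 10;          // "loop size"
>         static final int PUTS_PER_SOCKET = 1000;           // "batch size"
>
>         public static void main(String[] args) {
>             ExecutorService pool = Executors.newFixedThreadPool(THREADS);
>             for (int t = 0; t < THREADS; t++) {
>                 final int threadId = t;
>                 pool.submit(() -> {
>                     try {
>                         for (int s = 0; s < SOCKETS_PER_THREAD; s++) {
>                             try (Socket sock = new Socket(HOST, PORT);
>                                  PrintWriter out = new PrintWriter(sock.getOutputStream(), true)) {
>                                 for (int i = 0; i < PUTS_PER_SOCKET; i++) {
>                                     long ts = System.currentTimeMillis() / 1000;
>                                     // telnet-style line: put <metric> <timestamp> <value> <tagk=tagv>
>                                     out.println("put load.test.metric " + ts + " "
>                                             + Math.random() + " thread=" + threadId);
>                                 }
>                             }
>                         }
>                     } catch (Exception e) {
>                         e.printStackTrace();
>                     }
>                 });
>             }
>             pool.shutdown();
>         }
>     }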
>
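> As a side note on points 4 and 7 (a sketch only; the config path and region
> count below are illustrative): OpenTSDB 2.0 has a setting to turn off its
> own hourly row compaction in the TSD configuration, e.g.
>
>     # opentsdb.conf -- disables the TSD's hourly row-compaction rewrite
>     tsd.storage.enable_compaction = false
>
> and the tsdb table can be pre-split at creation time, for example with
> create 'tsdb', 't', {NUMREGIONS => 16, SPLITALGO => 'UniformSplit'} in the
> HBase shell, so the write load is spread across region servers from the
> start.
>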
> Thanks
>



