I'm still kinda new to HBase so please excuse me if I am wrong. I suspect
the reason has to do with a different slide from their presentation where
they run a job every hour to combine all the cells from the previous hour
into one cell.
OpenTSDB has quite a long row key. It contains the metric
Sorry, accidentally hit send. I'm guessing a 10 minute time slice would
drop their space savings from 4-8x down to 2-4x.
On Aug 27, 2013 11:30 PM, Chris Perluss tradersan...@gmail.com wrote:
I'm still kinda new to HBase so please excuse me if I am wrong. I suspect
the reason has to do with a
It might help to pick a granularity level. For example let's suppose you
pick a granularity level of 0.1.
Any piece of the song you receive should be broken down into segments of
0.1 and they need to be aligned on 0.1.
Example: you receive a piece of the song from 0.65 to 0.85.
You would break
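The alignment scheme described above can be sketched in plain Java (a minimal illustration under my own assumptions — the class and method names are made up, not from this thread):

```java
// Split an arbitrary interval into segments that start and end on
// multiples of the chosen granularity (0.1 here), as described above.
import java.util.ArrayList;
import java.util.List;

public class SegmentAligner {
    static final double GRANULARITY = 0.1;

    // Returns the aligned segment boundaries covering [start, end).
    // E.g. 0.65..0.85 -> segments [0.6,0.7), [0.7,0.8), [0.8,0.9)
    public static List<double[]> align(double start, double end) {
        List<double[]> segments = new ArrayList<>();
        // Snap the start down and the end up to the granularity grid.
        double s = Math.floor(start / GRANULARITY) * GRANULARITY;
        double e = Math.ceil(end / GRANULARITY) * GRANULARITY;
        // Small epsilon guards against floating-point drift in the loop.
        for (double t = s; t < e - 1e-9; t += GRANULARITY) {
            segments.add(new double[] {t, t + GRANULARITY});
        }
        return segments;
    }

    public static void main(String[] args) {
        for (double[] seg : align(0.65, 0.85)) {
            System.out.printf("[%.1f, %.1f)%n", seg[0], seg[1]);
        }
    }
}
```

Each aligned segment can then serve as a stable row-key component, since every writer produces the same boundaries for overlapping pieces.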
Hey Ravi,
Seems I found what the problem was: when communicating with Stargate I did not set the
Accept header to application/json. It was octet-stream, and according to the
documentation that can only return one value.
Thanks.
On Wed, Aug 28, 2013 at 8:46 AM, Dmitriy Troyan troyan.dmit...@gmail.com wrote:
Please
Hi Chris,
Thanks a lot for the detailed response. I'll definitely try this design and
see how it performs.
Anand
On 28 August 2013 13:56, Chris Perluss tradersan...@gmail.com wrote:
It might help to pick a granularity level. For example let's suppose you
pick a granularity level of 0.1.
Hi all,
I know that we can go to the HBase UI and make a split on our table so
that it will be distributed over the cluster. Is there a way to find this out
via an API, and possibly change it? This is to know how many map tasks will
run on our table before we actually run the MR job.
--
Regards-
To check how many regions you have in a table (and possibly what they are), use
HBaseAdmin#getTableRegions:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#getTableRegions(byte[])
In order to split the table you can use
Hi,
We are running a stress test on our 5-node cluster and we are getting the
expected mean latency of 10 ms. But we are seeing around 20 reads out of 25
million with latency of more than 4 seconds. Can anyone provide
insight into what we can do to meet a sub-second SLA for each and every
How do I get the server name associated with the region?
On Wed, Aug 28, 2013 at 3:46 PM, Ashwanth Kumar
ashwanthku...@googlemail.com wrote:
To check how many regions you have in a table (and possibly what they are), use
HBaseAdmin#getTableRegions:

List<HRegionInfo> regions = HBaseAdmin#getTableRegions(tablename);
HRegionInfo.getServerName();

(List: http://docs.oracle.com/javase/6/docs/api/java/util/List.html
HRegionInfo: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html)
Regards,
Surendra M
-- Surendra Manchikanti
On Wed, Aug
Hi,
According to the error message in the master log, there may be some inconsistencies
in the configuration. Check the configuration on all nodes to see if the property
below is configured, and whether it is inconsistent.

<property>
  <name>hbase.metrics.showTableName</name>
  <value>true</value>
</property>
Taking what Ravi Kiran mentioned a level higher, you can also use Pig. It
has DBStorage. It is very easy to read from HBase and dump to MySQL if your data
porting does not require complex transformation (and even that can be handled
in Pig too).
HI All,
We have a very heavy map reduce job that goes over entire table with over
1TB+ data in HBase and exports all data (Similar to Export job but with
some additional custom code built in) to HDFS.
However this job is not very stable, and often times we get following error
and job fails:
Hi,
I am using HBase 0.94.11 with Hadoop 1.1.2
I want to improve my current monitoring solution and I create a custom
MetricsSink that export metrics in a custom format. This solution runs
perfect with Hadoop.
Unfortunately, I cannot say the same thing about HBase.
I have several questions:
1.
Couple of things:
- Can you check the resources on the region server for which you get the lease
exception? It seems like the server is heavily thrashed
- What are your values for scan.setCaching and scan.setBatch?
The lease does not exist exception generally happens when the client goes back
From the log you posted on pastebin, I see the following.
Can you check namenode log to see what went wrong ?
1. Caused by:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
You can also use HTable#getRegionLocations():

public NavigableMap<HRegionInfo, ServerName> getRegionLocations() throws IOException {
FYI
On Wed, Aug 28, 2013 at 6:12 AM, Surendra , Manchikanti
surendra.manchika...@gmail.com wrote:
Thanks for your response.
I checked namenode logs and I find following:
2013-08-28 15:25:24,025 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: recover
lease [Lease. Holder:
DFSClient_hb_rs_smartdeals-hbase14-snc1.snc1,60020,1377700014053_-346895658_25,
pendingcreates:
Hi,
I have data in form:
source, destination, connection
This data is saved in hdfs
I want to read this data and put it in hbase table something like:
Column1 (Source) | Column2 (Destination) | Column3 (Connection Type)
Row: vertex A | vertex B | connection
MapReduce job reading in your data in HDFS and then emitting Puts against
the target table in the Mapper since it looks like there isn't any
transform happening...
http://hbase.apache.org/book/mapreduce.example.html
Likewise, what Harsh said a few days ago.
On 8/27/13 6:33 PM, Harsh J
Hbase comes with Bulkload tool. Please check below link.
http://hbase.apache.org/book/arch.bulk.load.html
Regards,
Surendra M
-- Surendra Manchikanti
On Wed, Aug 28, 2013 at 11:39 PM, Doug Meil
doug.m...@explorysmedical.com wrote:
MapReduce job reading in your data in HDFS and then
cf in this example is a column family, and this needs to exist in the
tables (both input and output) before the job is submitted.
On 8/26/13 3:01 PM, jamal sasha jamalsha...@gmail.com wrote:
Hi,
I am new to hbase, so few noob questions.
So, I created a table in hbase:
A quick scan gives
Hi to everybody,
I have two questions:
- My HBase table is composed by a UUID as a key and xml as content in a
single column.
What is currently the best option to read all those xml documents, deserialize
them to their object representation and add them to Solr (or another indexing
system)?
The problem
I keep getting these error messages when I run multiple clients. For a single
client, the same table/query gets done in 400 msec. But for 60 clients it jumps
to 10 secs (10,000 msec). Any ideas on where the bottleneck could be, or how to
go about debugging this?
Regards,
- kiru
Kiru
1. A 4 sec max latency is not that bad taking into account the 12GB heap. It can be
much larger. What is your SLA?
2. Block evictions are the result of a poor cache hit rate and the root cause of
periodic stop-the-world GC pauses (the max latencies you have been observing in the
test).
3.
3.
Just ignore the last part: 'If you don't have in_memory column families you may
decrease'
Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com
From: Vladimir Rodionov
Sent: Wednesday, August
Can you post error message ?
Thanks
On Wed, Aug 28, 2013 at 12:00 PM, Kiru Pakkirisamy
kirupakkiris...@yahoo.com wrote:
I keep getting these error message when I run multiple clients. For a
single client, the same table/query gets done in 400 msec. But for 60
clients it jumps to 10 secs
Hi All,
I have been looking at the parseColumn method in the KeyValue class of
HBase. The javadoc of that method
does not recommend that method be used. I wanted to know if there is any other
existing API which can be used in
place of the above method.
Thanks
Vandana
Right, 4 sec is good.
@Saurabh - so your read is getting 20 out of 25 million rows? Is this a
Get or a Scan?
BTW, in this stress test how many concurrent clients do you have?
Regards,
- kiru
From: Vladimir Rodionov vrodio...@carrieriq.com
To:
Ted,
There is no error message, only a WARN. It lists all the arguments to the
call.
It seems like a resource configuration issue. I am unable to get past 60
concurrent clients or so.
And I did set the RPC handler count to 400.
(I just deleted the log file so I have to redo it again, it takes
On Wed, Aug 28, 2013 at 6:55 AM, Ionut Ignatescu
ionut.ignate...@gmail.com wrote:
MetricsSink that export metrics in a custom format. This solution runs
Metrics2 support in HBase will be released in 0.96. 0.94.x versions
of HBase still use the older metrics system.
Hi
About the internals of locking a row in hbase.
Does hbase row locks map one-to-one with a locks in zookeeper or are there
any optimizations based on the fact that a row only exist on a single
machine?
Cheers,
-Kristoffer
Hi Vlad,
Thanks for your response.
1. Our SLA is less than one sec; we cannot afford latency of more than 1 sec.
We can increase the heap size if that helps; we have enough memory on the server. What
would be the optimal heap size?
2. The cache hit ratio is 95%. One thing I don't understand is that we have
RowLock API has been removed in 0.96.
Can you tell us your use case ?
On Wed, Aug 28, 2013 at 3:14 PM, Kristoffer Sjögren sto...@gmail.com wrote:
Hi
About the internals of locking a row in hbase.
Does hbase row locks map one-to-one with a locks in zookeeper or are there
any optimizations
Thanks Kiru. We need less than 1 sec latency.
We are using both multiGet and get.
We have three concurrent clients running 10 threads each (that makes 30
concurrent clients in total).
Thanks,
Saurabh.
On Aug 28, 2013, at 4:30 PM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote:
Right 4
Worst case you can use ZK to do the same if you only need that from time to
time?
On 2013-08-28 18:19, Ted Yu yuzhih...@gmail.com wrote:
RowLock API has been removed in 0.96.
Can you tell us your use case ?
On Wed, Aug 28, 2013 at 3:14 PM, Kristoffer Sjögren sto...@gmail.com
wrote:
I want a distributed lock condition for doing certain operations that may
or may not be unrelated to hbase.
On Thu, Aug 29, 2013 at 12:18 AM, Ted Yu yuzhih...@gmail.com wrote:
RowLock API has been removed in 0.96.
Can you tell us your use case ?
On Wed, Aug 28, 2013 at 3:14 PM, Kristoffer
On Wed, Aug 28, 2013 at 12:46 PM, Vandana Ayyalasomayajula
avand...@yahoo-inc.com wrote:
Hi All,
I have been looking at the parseColumn method in the KeyValue class of
HBase. The javadoc of that method
does not recommend that method be used. I wanted to know if there is any
other existing
Saurabh, we are able to read 600K row-columns in 400 msec. We have put what was a
40 million row table into 400K rows and columns. We Get about 100 of the rows from
this 400K, do quite a bit of calculations in the coprocessor (almost a
group-order by) and return within this time.
Maybe you should consider
Thanks Stack for following up.
On Aug 28, 2013, at 3:31 PM, Stack wrote:
On Wed, Aug 28, 2013 at 12:46 PM, Vandana Ayyalasomayajula
avand...@yahoo-inc.com wrote:
Hi All,
I have been looking at the parseColumn method in the KeyValue class of
HBase. The javadoc of that method
does not
Increasing the Java heap size will make latency worse, actually.
You can't guarantee 1 sec max latency if you run a Java app (unless your heap size is
much less than 1GB).
I have never heard of a strict maximum latency limit. Usually it's the 99%, 99.9%
or 99.99% query percentiles.
You can greatly reduce
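As an aside to the percentile point above, an SLA stated this way is computed over a latency sample rather than the max; a generic sketch (my own illustration, not from the thread) using the nearest-rank method:

```java
// Nearest-rank percentile over observed latencies: the smallest value
// with at least p% of samples at or below it.
import java.util.Arrays;

public class Percentile {
    public static long percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        // Rank of the percentile in the sorted sample (1-based).
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        // One GC-pause outlier among otherwise ~10 ms reads.
        long[] samples = {8, 9, 10, 11, 12, 9, 10, 11, 4000, 10};
        System.out.println("p99 = " + percentile(samples, 99.0) + " ms");
        System.out.println("p50 = " + percentile(samples, 50.0) + " ms");
    }
}
```

Note how a single stop-the-world pause dominates the p99 while leaving the median untouched, which is why max-latency guarantees are so hard to give for JVM apps.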
Any ideas? Anyone?
On Wed, Aug 28, 2013 at 9:36 AM, Ameya Kanitkar am...@groupon.com wrote:
Thanks for your response.
I checked namenode logs and I find following:
2013-08-28 15:25:24,025 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: recover
lease [Lease.
Oh, so they have them packed into one cell. If so, it's now reasonable
that they claim it speeds up row seeking.
thanks a lot.
2013/8/28 Chris Perluss tradersan...@gmail.com
Sorry, accidentally hit send. I'm guessing a 10 minute time slice would
drop their space savings from 4-8x down to
You could add an isLocked column to your row. When you want to lock or
update your row then use checkAndPut and check that isLocked=0. When
unlocking your row then checkAndPut that isLocked=1. You will have
effectively locked the row for the purposes of your application without
affecting HBase
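The pattern above is an optimistic compare-and-set; here is a minimal sketch of it using a tiny in-memory stand-in for the atomic cell update (my own illustration — against a real cluster you would use HTable's checkAndPut on the isLocked column instead of the map below):

```java
// Application-level row lock via compare-and-set on an "isLocked" column,
// simulated with a ConcurrentHashMap standing in for the HBase cell.
import java.util.concurrent.ConcurrentHashMap;

public class RowLockSketch {
    // row key -> value of the hypothetical "isLocked" column
    private final ConcurrentHashMap<String, String> isLockedCol =
            new ConcurrentHashMap<>();

    // Stand-in for checkAndPut: write newValue only if the current value
    // equals expected; returns true on success, as HBase's call does.
    private boolean checkAndPut(String row, String expected, String newValue) {
        if (expected == null) {
            return isLockedCol.putIfAbsent(row, newValue) == null;
        }
        return isLockedCol.replace(row, expected, newValue);
    }

    // Acquire: atomically flip isLocked from "0" to "1".
    public boolean lock(String row) {
        isLockedCol.putIfAbsent(row, "0");   // initialize the column once
        return checkAndPut(row, "0", "1");
    }

    // Release: atomically flip isLocked from "1" back to "0".
    public boolean unlock(String row) {
        return checkAndPut(row, "1", "0");
    }
}
```

Only one client can win the "0" to "1" transition, so concurrent lockers see `false` and must retry or back off; HBase itself never blocks on the row.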
Ted,
Can you clarify...
Do you mean the API is no longer a public API, or do you mean no more RLL for
atomic writes?
On Aug 28, 2013, at 5:18 PM, Ted Yu yuzhih...@gmail.com wrote:
RowLock API has been removed in 0.96.
Can you tell us your use case ?
On Wed, Aug 28, 2013 at 3:14 PM,
The API is no longer a public API
Thanks
On Wed, Aug 28, 2013 at 7:58 PM, Michael Segel michael_se...@hotmail.com wrote:
Ted,
Can you clarify...
Do you mean the API is no longer a public API, or do you mean no more RLL
for atomic writes?
On Aug 28, 2013, at 5:18 PM, Ted Yu
Specifically the API has been removed because it had never actually worked
correctly.
Rowlocks are used by RegionServers for intra-region operations.
As such they are ephemeral, in-memory constructs, that cannot reliably outlive
a single RPC request.
The HTable rowlock API allowed you to
A 1s SLA is tough in HBase (or any large memory JVM application).
Maybe, if you presplit your table, play with JDK7 and the G1 collector, but
nobody here will vouch for such an SLA in the 99th percentile.
I heard some folks have experimented with 30GB heaps and G1 and have reported
max GC