Thanks for clarifying the usage part, Stack!
As for programmatic monitoring, I was trying to figure out how to extend the
already available metrics capture mechanism (available for Region server and
Master server processes) to dump some *custom* metrics into a file (using
the
After some digging, it turned out to be a goof-up with the jar file I loaded.
It works flawlessly in both cases now.
./Zahoor
On 26-Sep-2012, at 3:35 PM, Julian Wissmann julian.wissm...@sdace.de wrote:
DoubleColumnInterpreter
On Tue, Oct 2, 2012 at 9:05 AM, lars hofhansl lhofha...@yahoo.com wrote:
You probably executed 120k next() RPCs against your server, unless you enabled
scanner caching.
(On a related note, we should probably not default this to 1, but something
more sensible, like 10 or 100).
We use 100.
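To put numbers on that (the 120k figure comes from the thread above; the class and method names are just for illustration), the next() RPC count for a scan is a ceiling division of the row count by the caching value:

```java
public class ScanRpcMath {
    // Each next() RPC returns `caching` rows, so a full scan costs
    // ceil(rows / caching) round trips to the region server.
    public static long nextRpcs(long rows, long caching) {
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        System.out.println(nextRpcs(120_000, 1));   // default caching of 1 -> 120000
        System.out.println(nextRpcs(120_000, 100)); // caching of 100 -> 1200
    }
}
```

Which is why the default matters so much: the same 120k-row scan drops from 120,000 round trips to 1,200.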
--
Hi there,
Another thing to consider on top of the scan-caching is that HBase is
doing more in the process of scanning the table. See...
http://hbase.apache.org/book.html#conceptual.view
http://hbase.apache.org/book.html#regions.arch
... Specifically, processing the KeyValues,
Thank you all! Setting a cache size helped a great deal. It's still slower
though.
I think the overhead of processing the data from the table might be the
cause.
I guess if HBase adds a layer of indirection on top of HDFS then it makes
sense that it'd be slower, right?
On
If you take HBase out of it and think of it from the standpoint of 2
programs, one of which opens a file and writes the output to another file,
and the other one which actually processes each row and then writes out
results, the 2nd one is going to be slower because it's doing more,
ceteris paribus.
Hello
2012/10/2 Marcos Ortiz mlor...@uci.cu
Another thing that I'm seeing is that one of your main processes is
compaction, so you can optimize all this by increasing the size of your
regions (by default the size of a region is 256 MB), but you could end up
with a split/compaction storm on your hands.
You really don't want to go to 20GB.
Without knowing the number of regions... going beyond 1-2 GB may cause more
headaches than it's worth.
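For reference, the setting that drives this tradeoff is hbase.hregion.max.filesize in hbase-site.xml; a sketch raising the 0.90-era 256 MB default to 1 GB (the value here is illustrative, not a recommendation):

```xml
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 1 GB in bytes; a region splits once a store file exceeds this -->
  <value>1073741824</value>
</property>
```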
Sorry, but I tend to be very cautious when it comes to tuning.
-Mike
On Oct 2, 2012, at 9:20 AM, Damien Hardy dha...@viadeoteam.com wrote:
Hello
Hello,
I am designing an HBase table for users and hope to get some
suggestions for my row key design. Thanks...
This user table will have columns which include user information such
as names, birthday, gender, address, phone number, etc... The first
time a user comes to us we will ask for all these
Hi all,
I was wondering how many HBase users there are in Paris (France...).
Would you guys be interested in participating in a Paris-based user group?
The idea would be to share HBase practices, with something like a meet-up
per quarter.
Reply to me directly or on the list, as you prefer.
Hi Lars,
That's an interesting observation; I hadn't thought about scans in
HAcid before. Your suggestion for a solution is really close to what I would do:
implement HAcidScan as an HBase Scan that filters according to the cache column.
Thanks, I will add this feature to the to-do list.
--
For information, there is a Hadoop User Group France in Paris.
https://twitter.com/hugfrance
You might want to get in touch. HBase is clearly on topic.
Regards
Bertrand
On Tue, Oct 2, 2012 at 4:32 PM, n keywal nkey...@gmail.com wrote:
And we are looking for speakers. Talks do not need to be formal/theoretical
presentations; they can also be feedback on your own experience.
You can submit proposals on the website: http://hugfrance.fr/
Regards
Bertrand
On Tue, Oct 2, 2012 at 4:56 PM, Bertrand Dechoux decho...@gmail.com wrote:
Thanks for the suggestions.
I was attempting to tune the GC via mapred.child.java.opts in the job's
Oozie config instead of in hbase-env.sh. I think this is why my efforts
were to no avail. It was likely having no effect on the read/write
performance. Is there any way of specifying job-specific
On 02/10/2012 11:32, Greg Ross wrote:
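For completeness, region server GC settings go in conf/hbase-env.sh rather than the job configuration; a sketch of the kind of line involved (the flags are placeholders, not tuning advice):

```sh
# Appended to the JVM options that the HBase daemons start with
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70"
```

mapred.child.java.opts, by contrast, only reaches the map/reduce task JVMs, which is why tuning it there has no effect on the region servers.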
I'm in !
On Tue, Oct 2, 2012 at 5:21 PM, Bertrand Dechoux decho...@gmail.com wrote:
Hi there, while this isn't an answer to some of the specific design
questions, this chapter in the RefGuide can be helpful for general design:
http://hbase.apache.org/book.html#schema
On 10/2/12 10:28 AM, Jason Huang jason.hu...@icare.com wrote:
Hello,
I am designing a HBase table for
Thanks Mohammad.
The issue with phone numbers is that they tend to change over time, and
we think name and DOB are more reliable. SSN is more unique, but the
issue is that we can't force the user to provide it. Basically we have
limited information that can be used.
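Given that name and DOB are the most reliable fields, one hypothetical row key shape (everything here, including the class name, hash width, and separator, is an illustration, not something from the thread) is a short hash prefix for write distribution plus the readable fields:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class UserRowKey {
    // Hypothetical composite key: a short MD5 prefix of name+DOB spreads
    // rows across regions, and the readable fields keep the key debuggable.
    public static String rowKey(String name, String dob) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest((name + "|" + dob).getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                sb.append(String.format("%02x", hash[i] & 0xff)); // 8 hex chars
            }
            return sb + "|" + name + "|" + dob;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 ships with every JDK
        }
    }

    public static void main(String[] args) {
        System.out.println(rowKey("jane doe", "1980-01-02"));
    }
}
```

The tradeoff: the hash prefix prevents hotspotting on sequential names, but it also gives up range scans by name, so it only fits if lookups are always by the full (name, DOB) pair.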
thanks,
Jason
On Tue, Oct 2,
Can you try using hbck?
In the future, don't remove anything before using hbck.
Thanks
On Oct 2, 2012, at 3:55 PM, Shumin Wu shumin...@gmail.com wrote:
Hi,
I am using HBase 0.92 and got stuck with deletion/recreation of a phantom
table. The table became phantom because the HBase server
Hello,
Maybe a silly question.
Data in the WAL is written in small transactions. One transaction is a set of
KeyValues for a specific (single) row. As we want each written transaction to
be durable we write them into the WAL one-by-one (ideally with FS sync()
calls, etc. on each write). Which is very
This is an interesting observation. I have not thought about HBASE-5229 in
terms of a performance improvement.
Currently HRegion.mutateRowsWithLocks actually acquires locks on all rows first
(since the contract here is a transaction), so (currently) you would get
unnecessarily reduced
Currently HRegion.mutateRowsWithLocks actually acquires
locks on all rows first (since the contract here is a transaction),
so (currently) you would get unnecessarily reduced concurrency
using that API for changes that do not need to be atomic.
Right, it's about unnecessarily reduced
That person should have been Lars, I think.
On Tue, Oct 2, 2012 at 7:04 PM, Alex Baranau alex.barano...@gmail.com wrote:
Currently HRegion.mutateRowsWithLocks actually acquires
locks on all rows first (since the contract here is a transaction),
so (currently) you would get unnecessarily
Hi,
Have a look at
https://github.com/sematext/HBaseMetricsContext +
http://blog.sematext.com/2011/07/31/extending-hadoop-metrics/ -- this
may lead you in the right direction if you really really need to do
this, although I'm not sure if that stuff is outdated now.
If you are just trying to
Heh, yes. See HDFS-744 and HBASE-5954.
And re: doMiniBatchMutation in HRegion, it does write multiple Puts (even for
different row keys) into a single WALEdit.
-- Lars
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Tuesday, October 2,
Thanks for that link Otis! This indeed allows completely overriding the
default monitoring by HBase; however, what we are really looking at is
capturing some additional metrics over and above what the monitoring is
already generating.
So, we figured out a way to achieve that through coprocessors, as
On Mon, Oct 1, 2012 at 10:52 PM, techbuddy techbuddy...@gmail.com wrote:
It means that in order to save space I need to use the smallest Column
Qualifier (and sometimes it makes sense)...
Yes
However, why is the Column Family (byte array) repeated for each KeyValue? Is
it physically repeated for each cell?
Yes, the CF byte[] is also physically stored in every cell (every KV). At
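To see why short names matter, a rough sketch of the per-cell key overhead (a simplified model assuming ASCII names; the real KeyValue layout also carries fixed-size length fields, which are omitted here):

```java
public class KeyValueKeySize {
    // Simplified per-cell key size: row + family + qualifier bytes,
    // plus an 8-byte timestamp and a 1-byte type.
    public static int keyBytes(String row, String family, String qualifier) {
        return row.length() + family.length() + qualifier.length() + 8 + 1;
    }

    public static void main(String[] args) {
        System.out.println(keyBytes("user123", "details", "phoneNumber")); // verbose names -> 34
        System.out.println(keyBytes("user123", "d", "p"));                 // one-letter names -> 18
    }
}
```

Multiplied across millions of cells, the gap between verbose and one-letter family/qualifier names adds up, which is why terse names are the common advice.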