Hi Joshi,
You can use Flume with the AsyncHBaseSink / HBaseSink to move data from the local file
system to HBase.
Thanks,
Surendra M
-- Surendra Manchikanti
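For illustration, a minimal Flume agent configuration for this kind of pipeline might look as follows. This is a sketch under assumptions: the agent name (`a1`), spool directory, table name, and column family are placeholders to adapt, and a spooling-directory source is assumed as the "local file system" input.

```
# Hypothetical Flume agent: spool a local directory into HBase
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/data/incoming
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = org.apache.flume.sink.hbase.AsyncHBaseSink
a1.sinks.k1.table = mytable
a1.sinks.k1.columnFamily = cf
a1.sinks.k1.channel = c1
```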
On Wed, Apr 17, 2013 at 10:01 AM, Omkar Joshi
wrote:
The background thread is here :
http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3ce689a42b73c5a545ad77332a4fc75d8c1efbe84...@vshinmsmbx01.vshodc.lntinfotech.com%3E
Following are the commands that I'm using to load files onto HBase :
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase clas
I am not sure I understand your question completely.
Were you asking about the upper bound on the number of regions, given certain
hardware resources? Can you outline your expectations for throughput /
latency?
I guess answers you may get would vary, depending on type of application,
etc.
On Tue, Apr 16,
Hi, all,
I want to know whether there is a criterion or bible for measuring the capacity of an
HBase cluster.
From my view, it depends on:
1. HDFS volume
2. system memory settings
3. network IO, etc.
However, as the number of tables and regions increases, these factors alone are not
enough to evaluate the serving ability,
Hi Anoop,
Actually, I got confused after reading the doc. I thought a simple importtsv
command (which also takes the table name as an argument) would suffice. But as you
pointed out, completebulkload is required.
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop
jar
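For reference, the two-step bulk load usually looks something like the sketch below. The table name (`mytable`), column spec, input path, and output path are placeholders, and the jar wildcard must be adapted to the installed HBase version:

```shell
# Step 1 (assumed layout): importtsv writes HFiles to an output directory
# instead of loading the table directly, because -Dimporttsv.bulk.output is set
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop \
  jar ${HBASE_HOME}/hbase-*.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1 \
  -Dimporttsv.bulk.output=/tmp/bulkout mytable /user/input

# Step 2: completebulkload moves the generated HFiles into the table's regions
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop \
  jar ${HBASE_HOME}/hbase-*.jar completebulkload /tmp/bulkout mytable
```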
Hi,
In the first half of this email, let me summarize our findings:
Yesterday afternoon Huned and I discovered an issue while playing with HBase
Snapshots on top of Hadoop's Snapshot branch (
http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-2802/).
HDFS (built from HDFS-2802 branch) doesn'
Hi Dylan,
$HBASE_HOME/bin/hbase hbck -fix
The above command can bring the cluster back to a normal state.
If the master is restarted while the ROOT/META region is closing (during a move or
balance), then the problem you have reported will come easily. Thanks for the
detailed logs.
Raised an issue for this. You
Hello,
We had problems with not being able to scan over a large (~8k regions)
table so we disabled and dropped it and decided to re-import data from
scratch into a table with the SAME name. This never worked and I list some
log extracts below.
The only way to make the import go through was to imp
This is fundamentally different, though. A scanner by default scans all regions
serially, because it promises to return all rows in sort order.
A multi-get is already parallelized across regions (and hence across region
servers).
Before we do a lot of work here we should first make sure that nothi
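The fan-out pattern described above (group keys by region, then fetch each group concurrently) can be sketched generically. This is not actual HBase client code; `locate_region` and `fetch_from_region` are hypothetical callbacks standing in for the client's region lookup and the per-regionserver RPC:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_multiget(keys, locate_region, fetch_from_region):
    """Group keys by region, then issue one fetch per region concurrently,
    mirroring how a multi-get fans out across region servers."""
    by_region = {}
    for key in keys:
        by_region.setdefault(locate_region(key), []).append(key)
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(by_region))) as pool:
        futures = [pool.submit(fetch_from_region, region, region_keys)
                   for region, region_keys in by_region.items()]
        for future in futures:
            results.update(future.result())  # merge per-region results
    return results
```

A scan, by contrast, cannot be reordered this way without giving up its sorted-results guarantee.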
I think the important thing about Column Families is trying to understand on
how to use them properly in a design.
Sparse data may make sense. It depends on the use case and an understanding of
the trade offs.
It all depends on how the data breaks down in to specific use cases.
Keeping CFs
bq. Maybe we can explain why there is some impacts, or what to consider?
The above would be covered in the JIRA.
Thanks
On Tue, Apr 16, 2013 at 7:04 AM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:
> Can we add more details than just changing the maximum CF number? Maybe we
> can expla
Can we add more details than just changing the maximum CF number? Maybe we
can explain why there are some impacts, or what to consider?
JM
2013/4/16 Ted Yu
> If there is no objection, I will create a JIRA to increase the maximum
> number of column families described here:
>
> http://hbase.apache
If there is no objection, I will create a JIRA to increase the maximum
number of column families described here:
http://hbase.apache.org/book.html#number.of.cfs
Cheers
On Mon, Apr 8, 2013 at 7:21 AM, Doug Meil wrote:
>
>
> For the record, the refGuide mentions potential issues of CF lumpiness
>
Hi Yun,
Attachments are not working on the mailing list. However, everyone
using HBase should have the book on their desk, and so do I ;)
In figure 8-11, you can see that the client will contact ZK to learn
where the root region is, then the root region to find the meta region, and
so on.
BUT This will
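The lookup chain described above can be sketched schematically. This is an illustration, not the real client API: `zk_lookup`, `root_lookup`, and `meta_lookup` are stub callbacks for the three hops, and the cache is what lets later requests bypass ZK entirely:

```python
def locate_row(row, zk_lookup, root_lookup, meta_lookup, cache):
    """Schematic old-style HBase region lookup: ZooKeeper -> -ROOT- ->
    .META. -> user region, with client-side caching of the result."""
    if row in cache:
        return cache[row]                        # warm cache: no ZK round trip
    root_region = zk_lookup()                    # ZK only stores -ROOT-'s location
    meta_region = root_lookup(root_region, row)  # -ROOT- points at .META.
    user_region = meta_lookup(meta_region, row)  # .META. points at the user region
    cache[row] = user_region
    return user_region
```

The cache is the key point: only the first request for a region pays the full three-hop cost.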
Hi, Jean and Jieshan,
Are you saying the client can directly contact region servers? Maybe I
overlooked it, but I think the client may need to look up regions by first
contacting ZK, as in figure 8-11 from the definitive book (as attached)...
Nevertheless, if it is the case, to assign a global timestamp, what is t
Have you looked at https://github.com/yahoo/omid/wiki ?
The Status Oracle implementation may give you some clue.
Cheers
On Apr 16, 2013, at 5:14 AM, yun peng wrote:
> Hi, All,
> I'd like to add a global timestamp oracle on Zookeep to assign globally
> unique timestamp for each Put/Get issued
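For illustration, the usual way a timestamp oracle avoids one coordination-service round trip per operation is to persist only the counter's upper bound, in batches. Below is a minimal sketch, not Omid's actual implementation; `persist` is a hypothetical callback standing in for the ZooKeeper write:

```python
import threading

class TimestampOracle:
    """Hands out strictly increasing timestamps; the backing store is
    written once per `batch` allocations instead of once per Put/Get."""

    def __init__(self, persist, batch=1000):
        self._persist = persist   # e.g. a ZooKeeper znode write (assumed)
        self._batch = batch
        self._lock = threading.Lock()
        self._next = 0            # next timestamp to hand out
        self._limit = 0           # highest timestamp persisted so far

    def next_timestamp(self):
        with self._lock:
            if self._next >= self._limit:
                self._limit = self._next + self._batch
                self._persist(self._limit)  # one remote write per batch
            ts = self._next
            self._next += 1
            return ts
```

After a crash, the oracle would restart from the last persisted limit, so timestamps stay unique at the cost of skipping some values; this is what keeps the coordination service out of the per-operation path.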
Yes, Jean-Marc Spaggiari is right. Performance is the big problem with this
approach, though ZooKeeper can help you implement it.
Regards,
Jieshan
-Original Message-
From: Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org]
Sent: Tuesday, April 16, 2013 8:20 PM
To: user@hbase.apache.or
Hi Yun,
If I understand you correctly, that means that each time you are going to do
a put or a get, you will need to call ZK first?
Since ZK has only one active master, that means this ZK master will be
called for each HBase get/put?
You are going to create a bottleneck there. I don't know h
Hi, All,
I'd like to add a global timestamp oracle on ZooKeeper to assign a globally
unique timestamp to each Put/Get issued from the HBase cluster. The reason I
put it on ZooKeeper is that each Put/Get needs to go through it, and a unique
timestamp needs some global centralized facility to assign it. But I am a
Hi Nicolas,
I think it might be good to create a JIRA for that anyway since seems that
some users are expecting this behaviour.
My 2¢ ;)
JM
2013/4/16 Nicolas Liochon
> I think there is something in the middle that could be done. It was
> discussed here a while ago, but without any JIRA create
I think there is something in the middle that could be done. It was
discussed here a while ago, but without any JIRA created. See thread:
http://mail-archives.apache.org/mod_mbox/hbase-user/201302.mbox/%3CCAKxWWm19OC+dePTK60bMmcecv=7tc+3t4-bq6fdqeppix_e...@mail.gmail.com%3E
If someone can spend s
So what is lacking here? The action should also be parallel inside the RS for
each region, instead of just parallel at the RS level?
Seems this will be rather difficult to implement, and for Get, it might not be
worth it?
>
> I looked
> at src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.ja