On Tue, Nov 13, 2018 at 7:20 PM Antonio Si wrote:
> Thanks Allan.
>
> Then, why is it a problem of having too many column families? If there are
> column
> families with no data, would that cause any issues?
>
> Thanks.
>
>
We have this note in the refguide [1]. It i
Thanks Allan.
Then, why is it a problem of having too many column families? If there are
column
families with no data, would that cause any issues?
Thanks.
Antonio.
On Tue, Nov 13, 2018 at 7:09 PM Allan Yang wrote:
> No, Every column family has its own memstore. Each one is 128MB in y
7:34 wrote:
> Hi,
>
> I would like to confirm my understanding.
>
> Let's say I have 13 column families in a hbase table. 11 of those column
> families have no data, while 2 column families have a large amount of data.
>
> My understanding is that the size of memstore,
Hi,
I would like to confirm my understanding.
Let's say I have 13 column families in a hbase table. 11 of those column
families have no data, while 2 column families have a large amount of data.
My understanding is that the size of memstore, which is 128M in my env,
will be shared across all column
Stack, sorry for the late answer. Took me a while to get to this.
On Thu, Aug 2, 2018 at 6:30 PM, Stack wrote:
> On Thu, Jul 12, 2018 at 4:31 AM Lars Francke
> wrote:
> >
> > I've got a question on the number of column families. I've told everyone
> > for years tha
On Thu, Jul 12, 2018 at 4:31 AM Lars Francke wrote:
>
> I've got a question on the number of column families. I've told everyone
> for years that you shouldn't use more than maybe 3-10 column families.
>
> Our book still says the following:
> "HBase currently does not do w
se layer, which produces some space and
>> query time benefits (and has some tradeoffs). So where I work the ideal is
>> one CF, although because we have legacy tables it is not universally
>> applied.
>>
>>
>> On Thu, Jul 12, 2018 at 4:31 AM Lars Francke
>> wrote
I work the ideal is
> one CF, although because we have legacy tables it is not universally
> applied.
>
>
> On Thu, Jul 12, 2018 at 4:31 AM Lars Francke
> wrote:
>
> > I've got a question on the number of column families. I've told everyone
> > for years that yo
duces some space and
query time benefits (and has some tradeoffs). So where I work the ideal is
one CF, although because we have legacy tables it is not universally
applied.
On Thu, Jul 12, 2018 at 4:31 AM Lars Francke wrote:
> I've got a question on the number of column families. I've
I've got a question on the number of column families. I've told everyone
for years that you shouldn't use more than maybe 3-10 column families.
Our book still says the following:
"HBase currently does not do well with anything above two or three column
families so keep the number of c
wrote:
> I get:
> return result->Value(family, qualifier).value()
>
> result is optional, OK - it works.
> But sometimes I must read unknown structure of table, or more often, I know
> families but I don't know qualifiers.
>
> P.S. At long last I succeed building HBase clien
I get:
return result->Value(family, qualifier).value()
result is optional, OK - it works.
But sometimes I must read unknown structure of table, or more often, I
know families but I don't know qualifiers.
P.S. At long last I succeed building HBase client out of Docker,
probably I can e
saying we still need to use
addColumnFamily to limit scan to 1 c/f? Here is the code for that test,
should addColumnFamily (or addColumn??) be used here, or it will read all
column families?
return new Scan(startInclusive, endExclusive) .setFilter(new
FilterList(FilterList.Operator.MUST_PASS_ALL, new
In HBase, even if you use KeyOnlyFilter, there is a column family involved.
In this case, if the scan does not call addFamily(), then I think all the
column families will be loaded.
Regards
Ram
On Tue, Aug 22, 2017 at 6:47 PM, Partha <parthaema...@gmail.com> wrote:
> One other observati
cribe 'TABLE1'
> Table TABLE1 is ENABLED
> TABLE1
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY =>
> 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF',
> TTL => 'FOREVER', COMPRESSION =>
don't read other families at all. (with or without
encoding).
Regards
Ram
On Tue, Aug 22, 2017 at 10:49 AM, ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote:
> Can you try one more thing - instead of addFamily try using
> addColumn(byte[] fam, byte[] qual). Since
to be sure - are you sure that the 4 CF table has only one
qualifier?
Regards
Ram
On Tue, Aug 22, 2017 at 8:17 AM, Partha <parthaema...@gmail.com> wrote:
> hbase(main):001:0> describe 'TABLE1'
> Table TABLE1 is ENABLED
> TABLE1
> COLUMN FAMILIES DESCRIPTION
> {NAME =>
hbase(main):001:0> describe 'TABLE1'
Table TABLE1 is ENABLED
TABLE1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY =>
'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF',
TTL => 'FOREVER', COMPRESSION =>
> >
> > -Anoop-
> >
> > On Sun, Aug 20, 2017 at 4:36 AM, Partha <parthaema...@gmail.com> wrote:
> >> Anoop,
> >>
> >> Yes, each column family (in both tables) uses the same encoding
> >> (fast-diff)
> >> and same compression (gzip).
> >>
> >> I suggest you to just try the simple test as my case and see if you
> notice
> >> a
> >> similar drop in performance (almost linear to the # of column families)
> >
> >
>
>
>
>> and same compression (gzip).
>>
>> I suggest you to just try the simple test as my case and see if you
notice
>> a
>> similar drop in performance (almost linear to the # of column families)
>
>
Will send across table statement and the test code. Pls let me know if you
find anything from your test given the inputs so far. Note that column
family has only 1 qualifier with json payload value of size 15KB. The
column families use fastdiff encoding and gzip compression.
Added user
7 at 4:42 PM, Partha <parthaema...@gmail.com> wrote:
> > I have 2 HBase tables - one with a single column family, and other has 4
> > column families. Both tables are keyed by same rowkey, and the column
> > families all have a single column qualifier each, with a json string as
Scan();
s.setStartRow
s.setStopRow
s.addFamily(cf)
Correct?
-Anoop-
On Thu, Aug 17, 2017 at 4:42 PM, Partha <parthaema...@gmail.com> wrote:
> I have 2 HBase tables - one with a single column family, and other has 4
> column families. Both tables are keyed by same rowkey, an
I have 2 HBase tables - one with a single column family, and other has 4
column families. Both tables are keyed by same rowkey, and the column
families all have a single column qualifier each, with a json string as
value (each json payload is about 10-20K in size). All column families use
fast
I have 2 HBase tables - one with a single column family, and other has 4
column families. Both tables are keyed by same rowkey, and the column
families all have a single column qualifier each, with a json string as
value (each json payload is about 10-20K in size). All column
I have 2 HBase tables - one with a single column family, and other has 4 column
families. Both tables are keyed by same rowkey, and the column families all
have a single column qualifier each, with a json string as value (each json
payload is about 10-20K in size). All column families use fast
>
>
> On Thu, Jun 22, 2017 at 4:06 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > bq. HBase doesn't do well with more than 2-3 column families
> >
> > The above is out of date - we have per column family flush which would
> > reduce the number of small hfiles.
Brian, Ted, thank you for your answers.
Ted, could you point out the HBase version where per column family flush
first appeared?
On Thu, Jun 22, 2017 at 4:06 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> bq. HBase doesn't do well with more than 2-3 column families
>
> The above is o
bq. HBase doesn't do well with more than 2-3 column families
The above is out of date - we have per column family flush which would
reduce the number of small hfiles.
bq. Why can't we just create several tables instead?
Currently hbase doesn't provide transaction across region boundary
One use-case that applies to my tables is that I have a table with a set of
columns that have data that is always processed with MR jobs, but other rather
large columns
that are generally only accessed through a UI. By separating those into two
column families, MR jobs that do a full table scan
Hi,
A general question regarding column families. It is said in the doc that
HBase doesn't do well with more than 2-3 column families because flushing
and compactions are done on a per region basis which should be addressed in
the future: http://hbase.apache.org/book.html#number.of.cfs
Hi,
I'm working on a project where we have a strange use case.
First off, we use bulk loading exclusively. We never use the put or bulk put
interface to load data into tables.
We have drivers that make me want to segregate data by tables and column
families. Our data is clearly delineated
roject where we have a strange use case.
>
> First off, we use bulk loading exclusively. We never use the put or bulk
> put interface to load data into tables.
>
> We have drivers that make me want to segregate data by tables and column
> families. Our data is clearly deline
is based on using essential column family (column family A
in
your case) to guide whether the remaining column families should be
loaded.
To be specific, if outside the TimeRange you specify (last day), your
filter returns ReturnCode.INCLUDE_AND_SEEK_NEXT_ROW.
What do you think
families should be
loaded.
To be specific, if outside the TimeRange you specify (last day), your
filter returns ReturnCode.INCLUDE_AND_SEEK_NEXT_ROW.
What do you think ?
Cheers
On Sat, Aug 1, 2015 at 8:06 PM, Dave Latham lat...@davelink.net
wrote:
Thanks
, Ted Yu yuzhih...@gmail.com wrote:
Dave:
I wonder if Filter response can be enhanced in the following manner:
http://pastebin.com/sb6apTPm
My approach is based on using essential column family (column family A in
your case) to guide whether the remaining column families should
can be enhanced in the following manner:
http://pastebin.com/sb6apTPm
My approach is based on using essential column family (column family A in
your case) to guide whether the remaining column families should be
loaded.
To be specific, if outside the TimeRange you specify (last day), your
Dave:
I wonder if Filter response can be enhanced in the following manner:
http://pastebin.com/sb6apTPm
My approach is based on using essential column family (column family A in
your case) to guide whether the remaining column families should be loaded.
To be specific, if outside the TimeRange
(column family A in
your case) to guide whether the remaining column families should be loaded.
To be specific, if outside the TimeRange you specify (last day), your
filter returns ReturnCode.INCLUDE_AND_SEEK_NEXT_ROW.
What do you think ?
Cheers
On Sat, Aug 1, 2015 at 8:06 PM, Dave Latham lat
I have a table with 2 column families, call them A and B, with new data
regularly being added. They are very different sizes: B is 100x the size of
A. Among other uses for this data, I have a MapReduce job that needs to
read all of A, but only recent data from B (e.g. last day). Here are some
Can you achieve your goal with two scans ?
The first scan specifies TimeRange corresponding to last day. This scan
returns both column families.
The other scan specifies TimeRange excluding last day. This scan returns
column family A.
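
The two-scan split suggested above can be sketched in plain Java. This is only a model of the two timestamp ranges, not HBase client code: the class and method names are hypothetical, and the cutoff is assumed to be "now minus one day". In the HBase client the ranges would presumably be passed to `Scan.setTimeRange(minStamp, maxStamp)`, whose range is half-open, with `Scan.addFamily` selecting A alone for the older range.

```java
import java.util.Arrays;

/** Plain-Java model of the two-scan plan; it does not call the HBase client. */
final class TwoScanPlan {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    /** {minStamp, maxStamp} for the scan over families A and B (last day onward). */
    static long[] recentRange(long nowMs) {
        // Open-ended upper bound so cells written during the job are not missed.
        return new long[] {nowMs - DAY_MS, Long.MAX_VALUE};
    }

    /** {minStamp, maxStamp} for the scan over family A only (everything older). */
    static long[] olderRange(long nowMs) {
        // maxStamp is exclusive, so the two ranges meet without overlapping.
        return new long[] {0L, nowMs - DAY_MS};
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("scan 1 (A and B): " + Arrays.toString(recentRange(now)));
        System.out.println("scan 2 (A only):  " + Arrays.toString(olderRange(now)));
    }
}
```

Both ranges share the same cutoff, so every timestamp falls into exactly one scan. The sketch says nothing about consistency between the two scanners.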
Cheers
On Sat, Aug 1, 2015 at 8:35 AM, Dave Latham lat
Hi Dave,
Would HBase be willing to accept updating Scan to have a different
TimeRange for each column family?
We could try it. I'm not sure how familiar you are with the relevant code.
I'm guessing some? Look at ScanQueryMatcher. This and related concerns
govern how we search through store
Have you considered using essential column family feature (through Filter) ?
In your case A would be the essential column family.
Within TimeRange for recent data, the filter would return both column
families.
Outside the TimeRange, only family A is returned.
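
As a rough illustration of the behaviour described here, the following plain-Java model shows which families would be read for one row. It is a toy, not the HBase Filter API: the class name, method name, and the `rowPassesFilter` flag (standing in for "the row falls inside the recent TimeRange") are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of essential-family loading; it does not use the HBase Filter API. */
final class EssentialFamilyModel {
    /**
     * Families read for a single row: the essential family is always read,
     * the remaining families only when the filter accepts the row.
     */
    static List<String> familiesRead(List<String> allFamilies, String essential,
                                     boolean rowPassesFilter) {
        List<String> read = new ArrayList<>();
        read.add(essential);
        if (rowPassesFilter) {
            for (String family : allFamilies) {
                if (!family.equals(essential)) {
                    read.add(family);
                }
            }
        }
        return read;
    }

    public static void main(String[] args) {
        List<String> families = Arrays.asList("A", "B");
        System.out.println("recent row: " + familiesRead(families, "A", true));
        System.out.println("older row:  " + familiesRead(families, "A", false));
    }
}
```

The model assumes the decision can be made from family A alone, which is exactly the point a real filter would have to satisfy.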
Cheers
On Sat, Aug 1, 2015 at 7:17
families the filter operates on (essential seems an odd name). If any
data from those column families passes the filter, then the scan loads and
includes data from the remaining families without filtering it. In my
case, it's not clear from a row's family A whether or not family B for that
row
Thanks for brainstorming, Ted. That sounds like option 2 I listed using a
separate scanner for A vs B which adds complexity to the job and gives up
the atomicity/consistency guarantees as new writes hit both column
families.
On Sat, Aug 1, 2015 at 9:07 AM, Ted Yu yuzhih...@gmail.com wrote:
Can
smaller than in A, I do not understand where is a
source of IO bottleneck?
On Aug 1, 2015 9:16 AM, Andrew Purtell apurt...@apache.org wrote:
Hi Dave,
Would HBase be willing to accept updating Scan to have a different
TimeRange for each column family?
We could try it. I'm not sure how
Hey Ted,
I was in the process of comparing insert throughputs which we
discussed using YCSB. What I could find is that when I split the data into
multiple column families the insert throughput drops to half
compared to persisting into a single column family. Do you think
Can you give a bit more detail, such as:
the release of HBase you're using
number of column families where slowdown is observed
size of cluster
release of hadoop you're using
Thanks
On Mon, Sep 29, 2014 at 9:43 AM, Nishanth S nishanth.2...@gmail.com wrote:
Hey Ted,
I was in the process
HBase release: 0.96.1
The number of column families at which the issue is observed is 2. Earlier I had a
single column family where all the data was persisted. In the new case I
was storing all metadata in column family 1 (less than 1 KB) and a blob
in the second column family (around 7 KB).
We have 9 node
.
Cheers
On Mon, Sep 29, 2014 at 10:00 AM, Nishanth S nishanth.2...@gmail.com
wrote:
HBase release: 0.96.1
The number of column families at which the issue is observed is 2. Earlier I had a
single column family where all the data was persisted. In the new case I
was storing all metadata in column family
I am trying to answer the below questions in this
scenario.
1. Would separating into multiple column families affect HBase write
performance?
2. How would it affect my read performance considering both the read cases?
3. Is there any advantage that I am gaining by separating into multiple CFs?
I
and this huge data
chunk).In general I am trying to answer the below questions in this
scenario.
1. Would separating into multiple column families affect HBase write
performance?
2. How would it affect my read performance considering both the read cases?
3. Is there any advantage that I am gaining
separating into multiple column families affect HBase write
performance?
2. How would it affect my read performance considering both the read
cases?
3. Is there any advantage that I am gaining by separating into multiple
CFs?
I would really appreciate if any one could point me
There should not be impact to hbase write performance for two column
families.
Cheers
On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S nishanth.2...@gmail.com
wrote:
Thank you, Ted. No, I do not plan to use bulk loading since the data is
incremental in nature.
On Thu, Sep 25, 2014 at 11:36 AM
Thank you Ted.
-Nishan
On Thu, Sep 25, 2014 at 11:56 AM, Ted Yu yuzhih...@gmail.com wrote:
There should not be impact to hbase write performance for two column
families.
Cheers
On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S nishanth.2...@gmail.com
wrote:
Thank you, Ted. No, I do not plan
We are doing schema design for our application, One thing we are not so
clear about is multiple column families (more than 3, probably 4 - 5) vs
multiple tables. In our use case, we will have the same number of rows in
all these column families, but some column families may be modified more
often
Kim [mailto:taeyun@innowireless.co.kr]
Sent: Wednesday, August 06, 2014 1:48 PM
To: user@hbase.apache.org
Subject: RE: Question on the number of column families
Thank you.
The 'dummy' column will always hold the value '1' (or even an empty string),
that only signifies that this row exists
1:48 PM
To: user@hbase.apache.org
Subject: RE: Question on the number of column families
Thank you.
The 'dummy' column will always hold the value '1' (or even an empty
string), that only signifies that this row exists. (And the real value is
in the other 'big' column family) The value
be used to minimize the scan cost.
Thank you.
-Original Message-
From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr]
Sent: Wednesday, August 06, 2014 1:48 PM
To: user@hbase.apache.org
Subject: RE: Question on the number of column families
Thank you.
The 'dummy
that's a subclass of the RowFilter.
- In that filter class, override isFamilyEssential() method to return true only
when the name of the 'dummy' column family is passed as an argument.
Now, HBase calls isFamilyEssential() method of my filter object for all the
column families including
Hi Qiang,
thank you for your help.
1. Regarding HBASE-5416, I think its purpose is simple.
Avoid loading column families that are irrelevant to filtering while scanning.
So, it can be applied to my 'dummy CF' case.
That is, a dummy CF can act like a 'relevant' CF to filtering, provided
the column families including the 'dummy' column family, and in result only
loads the 'dummy' column family and happily filters rowkey using the
KeyValue objects from the 'dummy' column family HFile(s).
Am I right?
BTW, it would be nice to have a method like
'setEssentialColumnFamilies(byte
Hi TaeYun,
thanks for the explanation.
On Thu, Aug 7, 2014 at 12:50 PM, innowireless TaeYun Kim
taeyun@innowireless.co.kr wrote:
Hi Qiang,
thank you for your help.
1. Regarding HBASE-5416, I think its purpose is simple.
Avoid loading column families that are irrelevant to filtering while
Hi,
According to http://hbase.apache.org/book/number.of.cfs.html, having more
than 2~3 column families is strongly discouraged.
BTW, in my case, records on a table have the following characteristics:
- The table is read-only. It is bulk-loaded once. When new data is ready,
a new
To: user@hbase.apache.org
Subject: Question on the number of column families
Hi,
According to http://hbase.apache.org/book/number.of.cfs.html, having more
than 2~3 column families is strongly discouraged.
BTW, in my case, records on a table have the following characteristics
of column families
Hi,
According to http://hbase.apache.org/book/number.of.cfs.html, having more
than 2~3 column families is strongly discouraged.
BTW, in my case, records on a table have the following characteristics:
- The table is read-only. It is bulk-loaded once. When new data
the values for the area that is displayed on the
screen.
-Original Message-
From: Alok Kumar [mailto:alok...@gmail.com]
Sent: Tuesday, August 05, 2014 8:24 PM
To: user@hbase.apache.org
Subject: Re: Question on the number of column families
Hi,
Hbase creates HFile per column-family
cache, unless the columns are separated by individual column family.
-Original Message-
From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr]
Sent: Tuesday, August 05, 2014 8:36 PM
To: user@hbase.apache.org
Subject: RE: Question on the number of column families
Thank you
cache, unless the columns are separated by individual column family.
-Original Message-
From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr]
Sent: Tuesday, August 05, 2014 8:36 PM
To: user@hbase.apache.org
Subject: RE: Question on the number of column families
Thank
As Alok mentioned previously, once columns are grouped into several column
families, you would be able to leverage essential column family feature
introduced by this JIRA:
HBASE-5416 Improve performance of scans with some kind of filters
Cheers
On Tue, Aug 5, 2014 at 5:26 AM, Alok Kumar alok
that make sense?
In that example, you have 4 column families.
There are other examples, but that should help you put column families in
perspective.
HTH
-Mike
On Aug 5, 2014, at 11:52 AM, Ted Yu yuzhih...@gmail.com wrote:
As Alok mentioned previously, once columns are grouped into several
One way to model the data would be to use a composite key that is made
up of the RDBMS primary_key + . + field_name. Then just have a single
column that contains the value of the field.
Individual field lookups will be a simple get, and to get all of the fields
of a record, you would do a scan with
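
The composite-key layout described in this message can be sketched as follows. The message is cut off before it says how the scan is bounded, so the prefix scan (start row = `primary_key.`, exclusive stop row = the prefix with its last byte incremented) is an assumption, as are the class and method names. The sketch builds only the key bytes and makes no HBase calls.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for the composite row key primary_key + "." + field_name. */
final class CompositeKey {
    /** Row key for one field of one record; a single get on this key fetches the field. */
    static byte[] rowKey(String primaryKey, String fieldName) {
        return (primaryKey + "." + fieldName).getBytes(StandardCharsets.UTF_8);
    }

    /** Start row for a scan over all fields of one record. */
    static byte[] recordPrefix(String primaryKey) {
        return (primaryKey + ".").getBytes(StandardCharsets.UTF_8);
    }

    /** Exclusive stop row for that scan: the prefix with its last byte incremented. */
    static byte[] stopRowFor(byte[] prefix) {
        if (prefix.length == 0) {
            throw new IllegalArgumentException("prefix must not be empty");
        }
        byte[] stop = prefix.clone();
        // The prefix always ends with '.' (0x2E), so this cannot overflow.
        stop[stop.length - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = recordPrefix("42");
        System.out.println(new String(rowKey("42", "name"), StandardCharsets.UTF_8));
        System.out.println(new String(stopRowFor(prefix), StandardCharsets.UTF_8));
    }
}
```

The sketch assumes the primary key itself never contains the `.` separator; otherwise a different delimiter or a fixed-width key would be needed to keep records from interleaving.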
Thank you all.
Facts learned:
- Having 130 column families is too much. Don't do that.
- While scanning, an entire row will be read for filtering, unless the HBASE-5416
technique is applied, which makes only the relevant column family load. (But it
seems that still one can't load just a column
, you can look at the unit test (TestJoinedScanners)
from HBASE-5416. You would understand this feature better.
Cheers
On Tue, Aug 5, 2014 at 9:21 PM, innowireless TaeYun Kim
taeyun@innowireless.co.kr wrote:
Thank you all.
Facts learned:
- Having 130 column families is too much. Don't do
: Question on the number of column families
bq. add a 'dummy' column family and apply HBASE-5416 technique
Adding dummy column family is not the way to utilize essential column family
support - what would this dummy column family hold ?
bq. since I have not read the filtering section of the book I'm
...@gmail.com wrote:
I need to generate from a 2TB dataset and exploded it to 4 Column Families.
The result dataset is likely to be 20TB or more. I'm currently using Spark
so I sorted the (rk, cf, cq) myself. It's huge and I'm considering how to
optimize it.
My question is:
Should I sort
about it because HBase sorts the row keys on its own but
lexicographically.
Cheers,
Arun
Sent from a mobile device. Please don't mind the typos.
On Jul 30, 2014 9:02 PM, Jianshi Huang jianshi.hu...@gmail.com wrote:
I need to generate from a 2TB dataset and exploded it to 4 Column
Families
...@gmail.com
wrote:
I need to generate from a 2TB dataset and exploded it to 4 Column Families.
The result dataset is likely to be 20TB or more. I'm currently using Spark
so I sorted the (rk, cf, cq) myself. It's huge and I'm considering how to
optimize it.
My question is:
Should I sort and write each
I need to generate from a 2TB dataset and exploded it to 4 Column Families.
The result dataset is likely to be 20TB or more. I'm currently using Spark
so I sorted the (rk, cf, cq) myself. It's huge and I'm considering how to
optimize it.
My question is:
Should I sort and write each column family
Lars,
when you say 'when one memstore needs to be flushed all other column
families are flushed', are you referring to other column families of the
same table, right?
2013/8/4 Rohit Kelkar rohitkel...@gmail.com
Regarding slow scan- only fetch the columns /qualifiers that you need. It
may
Pablo,
That is correct.
On Mon, Aug 5, 2013 at 10:00 AM, Pablo Medina pablomedin...@gmail.com wrote:
Lars,
when you say 'when one memstore needs to be flushed all other column
families are flushed', are you referring to other column families of the
same table, right?
2013/8/4 Rohit
Hi,
I have tested read performance after reducing the number of column families
from 14 to 3, and yes, there is an improvement.
Meanwhile I was going through the paper published by Google on Bigtable.
It says
It is our intent that the number of distinct column
families in a table be small (in the hundreds
.
On Aug 4, 2013 2:29 AM, Vimal Jain vkj...@gmail.com wrote:
Hi,
I have tested read performance after reducing the number of column families
from 14 to 3, and yes, there is an improvement.
Meanwhile I was going through the paper published by Google on Bigtable.
It says
It is our intent
read performance after reducing the number of column families
from 14 to 3, and yes, there is an improvement.
Meanwhile I was going through the paper published by Google on
Bigtable.
It says
It is our intent that the number of distinct column
families in a table be small (in the hundreds
Vimal,
It really depends on your usage pattern but HBase != Bigtable.
On Aug 4, 2013 2:29 AM, Vimal Jain vkj...@gmail.com wrote:
Hi,
I have tested read performance after reducing the number of column
families
from 14 to 3, and yes, there is an improvement.
Meanwhile I
columns such that a scan is often limited to a single
Column Family, you'll get huge benefit by using more Column Families.
The main consideration for many Column Families is that each has its own store
files, and hence scanning involves more seeking for each Column Family
included in a scan
Family, you'll get huge benefit by using more Column Families.
The main consideration for many Column Families is that each has its own
store files, and hence scanning involves more seeking for each Column
Family included in a scan.
They are also flushed together; when one memstore (which
Thanks Dhaval/Michael/Ted/Otis for your replies.
Actually, I asked this question because I am seeing some performance
degradation in my production HBase setup.
I have configured HBase in pseudo-distributed mode on top of HDFS.
I have created 17 column families :( . I am actually using 14 out
When you did the scan, did you check what the bottleneck was ? Was it I/O ?
Did you see any GC locks ? How much RAM are you giving to your RS ?
-Viral
On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote:
To completely scan the table for all 140 columns , it takes around 30-40
I scanned it during normal traffic hours. There was no I/O load on the
server.
I don't see any GC locks either.
Also, I have given 1.5G to the RS and 512M each to the Master and ZooKeeper.
One correction to the post above:
The actual time to scan the whole table is even longer; it takes 10 mins to scan 0.1
million rows
Can someone please reply ?
Also, what is the typical read/write speed of HBase, and how much deviation
would there be in my scenario mentioned above (14 CFs, 140 columns in total)?
I am asking this because I am not simply printing out the scanned values;
instead I am applying some logic on the data
Subject: Re: How many column families in one table ?
Can someone please reply ?
Also, what is the typical read/write speed of HBase, and how much deviation
would there be in my scenario mentioned above (14 CFs, 140 columns in total)?
I am asking this because I am not simply printing out the scanned
? Otherwise each call to next() is an RPC
round trip and you are basically measuring your network's RTT.
-- Lars
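
A back-of-the-envelope check of this point, assuming the cut-off question is about scanner caching (set with `Scan.setCaching` in the HBase client): the number of next() round trips is the row count divided by the rows fetched per RPC, rounded up. The class and method names below are hypothetical. The figures in main come from the thread: 0.1 million rows scanned in 10 minutes.

```java
/** Hypothetical estimator for scanner round trips; plain arithmetic, no HBase calls. */
final class ScanRpcEstimate {
    /** Round trips needed to fetch `rows` rows when each RPC returns up to `caching` rows. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    /** Average wall-clock milliseconds spent per row. */
    static double millisPerRow(long rows, long elapsedMs) {
        return (double) elapsedMs / rows;
    }

    public static void main(String[] args) {
        long rows = 100_000L;                // 0.1 million rows, as reported in the thread
        long elapsedMs = 10L * 60 * 1000;    // 10 minutes
        System.out.println("ms per row:             " + millisPerRow(rows, elapsedMs));
        System.out.println("round trips, caching 1:   " + roundTrips(rows, 1));
        System.out.println("round trips, caching 100: " + roundTrips(rows, 100));
    }
}
```

A per-row cost of a few milliseconds is in the range of a network round trip, which is consistent with the suggestion that the scan is measuring RTT and not read throughput.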
From: Vimal Jain vkj...@gmail.com
To: user@hbase.apache.org
Sent: Monday, July 1, 2013 4:11 AM
Subject: Re: How many column families in one table ?
Can
this question because I am seeing some performance
degradation in my production HBase setup.
I have configured HBase in pseudo-distributed mode on top of HDFS.
I have created 17 column families :( . I am actually using 14 out of these
17 column families.
Each column family has on average around 8
.
Actually, I asked this question because I am seeing some performance
degradation in my production HBase setup.
I have configured HBase in pseudo-distributed mode on top of HDFS.
I have created 17 column families :( . I am actually using 14 out of
these
17 column families.
Each column family
, 2013 4:44 AM
Subject: Re: How many column families in one table ?
Hi,
We had some hardware constraints along with the fact that our total data
size was in GBs.
That's why, to start with HBase, we first began with pseudo-distributed
mode and thought if required we would upgrade to fully
column families in one table ?
Hi,
We had some hardware constraints along with the fact that our total data
size was in GBs.
That's why, to start with HBase, we first began with pseudo-distributed
mode and thought if required we would upgrade to fully distributed mode.
On Mon, Jul 1
On Mon, Jul 1, 2013 at 10:06 AM, Vimal Jain vkj...@gmail.com wrote:
Sorry for the typo. Please ignore the previous mail. Here is the corrected
one.
1) I have around 140 columns for each row. Out of 140, around 100 columns
hold Java primitive data types; the remaining 40 columns contain
Vimal:
Please also refer to:
http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning
On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote:
Short answer... As few as possible.
14 CF doesn't make too much sense.
Sent from
Hi All ,
Thanks for your replies.
Ted,
Thanks for the link, but it's not working. :(
On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote:
Vimal:
Please also refer to:
http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning