In that case, hypothetically speaking, you could disable the HBase
blockcache on the table containing static content and rely on an external
reverse proxy tier, while enabling the HBase blockcache on the tables that
you use to generate dynamic content.
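(For readers following along: the per-table split described above maps onto the column-family `BLOCKCACHE` attribute. A sketch in HBase shell syntax, with hypothetical table and family names:)

```
# Disable the block cache for the family holding static images;
# an external reverse proxy tier caches these instead.
hbase> alter 'images', {NAME => 'cf', BLOCKCACHE => 'false'}

# Keep (or re-enable) the block cache for tables backing dynamic content.
hbase> alter 'sessions', {NAME => 'cf', BLOCKCACHE => 'true'}
```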
On Mon, Jan 28, 2013 at 1:44 PM,
Hi Andy,
Thanks a lot for sharing. Yes, I am not talking about static content
caching, which might be called an internal CDN today.
I am asking about techniques for configuring caches at different layers
while avoiding duplicate caching across those layers.
thanks and regards,
Yiyu
You bring up a very common consideration I think.
For static content, such as images, a cache can help offload read load
from the datastore. This fits into this conversation.
For dynamic content, an external cache may not be helpful, as you say,
although the blockcache within HBase will help.
> > [...] difficult to manage HBase without very hard operations and
> > maintenance in play.
> > - Jack

blockCacheHitRatio=94, blockCacheHitCachingRatio=98

Note that memstore is only 2G; this particular regionserver HEAP is set
to 5G.

And last but not least, it's very important to have a good GC setup:

export HBASE_OPTS="$HBASE_OPTS -verbose:gc -Xms5000m
-XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+HeapDumpOnOutOfMemoryError -Xloggc:$HBASE_HOME/logs/gc-hbase.log \
-XX:MaxTenuringThreshold=15 -XX:SurvivorRatio=8 \
-XX:+UseParNewGC \
-XX:NewSize=128m -XX:MaxNewSize=128m \
-XX:-UseAdaptiveSizePolicy \
-XX:+CMSParallelRemarkEnabled \
-XX:-TraceClassUnloading
"

-Jack

On Thu, Jan 17, 2013 at 3:29 PM, Varun Sharma <va...@pinterest.com> wrote:
> Hey Jack,
>
> Thanks for the useful information. By flush size being 15 %, do you mean
> the memstore flush size ? 15 % would mean close to 1G, have you seen any
> issues with flushes taking too long ?
>
> Varun
>
> On Sun, Jan 13, 2013 at 8:17 AM, Jack Levin wrote:
>
>> That's right, Memstore size, not flush size, is increased. Filesize is
>> 10G. Overall write cache is 60% of heap and read cache is 20%. Flush
>> size is 15%. 64 maxlogs at 128MB. One namenode server, one secondary
>> that can be promoted. On the way to hbase, images are written to a
>> queue, so that we can take Hbase down for maintenance and still do
>> inserts later. ImageShack has 'perma cache' servers that allow writes
>> and serving of data even when hbase is down for hours; consider it a
>> 4th replica 😉 outside of hadoop.
>>
>> Jack

> [...] the index in HBase which points to the file and then fetching the
> file. It could be faster... we found storing binary data in a sequence
> file indexed on HBase to be faster than HBase,
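(For readers trying to map the heap percentages Jack describes onto configuration: a rough hbase-site.xml sketch using 0.94-era property names. The absolute byte values are assumptions derived from his description, not his actual config.)

```xml
<!-- Write cache: memstores may use up to 60% of the RegionServer heap -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.6</value>
</property>
<!-- Read cache: the block cache gets 20% of heap -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.2</value>
</property>
<!-- Flush size ~15% of heap, expressed in bytes (here ~1 GB) -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>1073741824</value>
</property>
<!-- 10 GB max region file size -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value>
</property>
<!-- 64 WAL files before a forced flush -->
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>64</value>
</property>
```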
*From:* Mohit Anchlia
*Sent:* January 13, 2013 7:48 AM
*To:* user@hbase.apache.org
*Subject:* Re: Storing images in Hbase

Thanks Jack for sharing this information. This definitely makes sense when
using this type of caching layer. You mentioned increasing the write
cache; I am assuming you had to increase the following parameters in
addition to increasing [...]
> [...] and keep [...] and update the metadata and the offset in HBase,
> because if you put a bigger image in hbase it will lead to some issues.
>
> ∞
> Shashwat Shriparv
>
> On Fri, Jan 11, 2013 at 9:21 AM, lars hofhansl wrote:
>> Interesting. That's close to a PB if my math is correct.
>> Is there a write-up about this somewhere? Something that we could link
>> from the HBase homepage?
>>
>> -- Lars
>>
>> - Original Message -
>> From: Jack Levin
>> To: user@hbase.apache.org
>> Cc: Andrew Purtell
>> Sent: Thursday, January 10, 2013 9:24 AM
>> Subject: Re: Storing images in Hbase
>>
>> We stored about 1 billion images into hbase with file size up to 10MB.
>> Its been running for close to 2 years without issues and serves
>> delivery of images for Yfrog and ImageShack. If you have any
>> questions about the setup, I would be glad to answer them.
Been there, done that... kind of an interesting problem...
Someone earlier said that HBase isn't good for images. It works pretty
well; again, it depends on the use case.
Your schema is also going to play a role, and you're going to have to tune
things a little differently, because when you pull [...]
Thanks Leonid.
Warm Regards,
Tariq
https://mtariq.jux.com/
On Fri, Jan 11, 2013 at 2:15 AM, Leonid Fedotov wrote:
> I'm voting for continuing here as well…
> So, location is up to Jack. :)
>
> Thank you!
>
> Sincerely,
> Leonid Fedotov
>
> On Jan 10, 2013, at 11:24 AM, Mohammad Tariq wrote:
>
This is a very interesting setup to analyze. I'm working on a similar
problem with HBase, so any help is welcome.
On 10/01/2013 16:39, Doug Meil wrote:
+1.
This question comes up enough on the dist-list it's worth getting some
pointers on record.
On 1/10/13 2:24 PM, "Mohammad Tariq" wrote:
Jack, Leonid,
I request you guys to please continue the discussion
through the thread itself if possible for you both. I would
like to know about Jack's setup. I too find it quite interesting.
Many thanks.
Warm Regards,
Tariq
https://mtariq.jux.com/
On Fri, Jan 11, 2013 at 12:50 AM, Leonid Fedotov wrote:
It might be interesting to share that here, just in case someone else
is facing the same usecase?
JM
2013/1/10, Leonid Fedotov :
Jack,
yes, this is very interesting to know your setup details.
Could you please provide more information?
Or we can take this off the list if you like…
Thank you!
Sincerely,
Leonid Fedotov
On Jan 10, 2013, at 9:24 AM, Jack Levin wrote:
We stored about 1 billion images into hbase with file size up to 10MB.
Its been running for close to 2 years without issues and serves
delivery of images for Yfrog and ImageShack. If you have any
questions about the setup, I would be glad to answer them.
-Jack
On Sun, Jan 6, 2013 at 1:09 PM, Mo
I have done extensive testing and have found that blobs don't belong in the
databases but are rather best left out on the file system. Andrew outlined
issues that you'll face and not to mention IO issues when compaction occurs
over large files.
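The "pointer in the database, blob on the filesystem" pattern described above can be sketched in a few lines. This is plain Python with a dict standing in for the HBase index row; the path and field names are illustrative, not any HBase API:

```python
import hashlib
import os

BLOB_DIR = "/tmp/blob_store"  # illustrative location for the blob files

def put_image(index: dict, key: str, data: bytes) -> str:
    """Store the blob on the filesystem; keep only a small pointer row."""
    os.makedirs(BLOB_DIR, exist_ok=True)
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(BLOB_DIR, digest)
    with open(path, "wb") as f:
        f.write(data)
    # The "row" holds metadata only, so compactions in the real datastore
    # would never have to rewrite the large blob itself.
    index[key] = {"path": path, "size": len(data), "sha1": digest}
    return path

def get_image(index: dict, key: str) -> bytes:
    """Look up the pointer row, then read the blob from the filesystem."""
    with open(index[key]["path"], "rb") as f:
        return f.read()
```

The point of the design is that the datastore only ever moves small metadata rows during splits and compactions, while the large immutable blobs sit untouched on disk.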
On Sun, Jan 6, 2013 at 12:52 PM, Andrew Purtell wrote:
I meant this to say "a few really large values"
On Sun, Jan 6, 2013 at 12:49 PM, Andrew Purtell wrote:
> Consider if the split threshold is 2 GB but your one row contains 10 GB as
> really large value.
--
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back.
What do you mean by "very large"?
One possible source of performance concern is HBase RPC does not do
positioned/chunked/partial reads, so both on the RegionServer and client
the entirety of the value data will be in the heap. A lot of really large
objects brought in this way under high concurrency can [...]
To add to Andy's point - storing images in HBase is fine as long as
the size of each image isn't huge. A couple of MBs per row in HBase
do just fine. But once you start getting into 10s of MBs, there are
more optimal solutions you can explore and HBase might not be the best
bet.
Amandeep
On Jan
What's the penalty performance wise of saving a very large value in a
KeyValue in hbase? Splits, scans, etc.
Sent from my iPad
On 6 Jan 2013, at 22:12, Andrew Purtell wrote:
Also YFrog / ImageShack serves all of its assets out of HBase too, so for
reasonably sized images some are having success. See
http://www.slideshare.net/jacque74/hug-hbase-presentation
On Sun, Jan 6, 2013 at 3:58 AM, Yusup Ashrap wrote:
there are a lot of great discussions on Quora on this topic.
http://www.quora.com/Apache-Hadoop/Is-HBase-appropriate-for-indexed-blob-storage-in-HDFS
http://www.quora.com/Is-it-possible-to-use-HDFS-HBase-to-serve-images
http://www.quora.com/What-is-a-good-choice-for-storing-blob-like-files-in-a-distri
Hi there,
Thank you, and happy new year.
I had the same problem and wrote a python module⁰ for thumbor¹.
I use the Thrift interface to HBase to store image blobs.
As already said, you have to keep image blobs quite small (for web latency
you have to keep them small too), around 100 KB, [...]