Been there, done that... kind of an interesting problem... 

Someone earlier said that HBase isn't good for images. It works pretty well, 
but again, it depends on the use case.

Your schema is also going to play a role, and you're going to have to tune 
things a little differently: when you pull an image, you're pulling a larger 
chunk of data, and you also want to make sure you can fit a decent number of 
images within a region. 
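
To put rough numbers on the "decent number of images within a region" point, 
here is a minimal back-of-the-envelope sketch in Python. It is not HBase code 
and calls no HBase API. The function names, the 10 GiB region size, and the 
2 MiB average image size are assumptions made purely for illustration, and the 
sketch ignores row key, metadata, and compression overhead. The 1 billion image 
count and the 10 MB upper bound come from Jack's message quoted further down.

```python
def images_per_region(region_max_bytes, avg_image_bytes):
    """Rough number of images one region can hold before hitting its size limit.

    Assumption: ignores row key, column metadata, and compression effects.
    """
    if region_max_bytes <= 0 or avg_image_bytes <= 0:
        raise ValueError("sizes must be positive")
    return region_max_bytes // avg_image_bytes


def regions_needed(image_count, per_region):
    """Ceiling division: regions required to hold image_count images."""
    if per_region <= 0:
        raise ValueError("per_region must be positive")
    return -(-image_count // per_region)


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical tuning value, not a recommendation.
REGION_MAX = 10 * GIB

worst_case = images_per_region(REGION_MAX, 10 * MIB)  # every image at the 10 MB cap
typical = images_per_region(REGION_MAX, 2 * MIB)      # assumed 2 MiB average

print(worst_case, regions_needed(1_000_000_000, worst_case))
print(typical, regions_needed(1_000_000_000, typical))
```

Under these assumed values a region holds somewhere between 1024 and 5120 
images, which is why the region size and the expected image size need to be 
looked at together rather than left at defaults.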


How are you planning on using the images? Are you going to run an M/R job to 
see if you can spot landmarks and businesses in a photo? Language 
translations? 
Or just a repository? 


On Jan 10, 2013, at 12:23 PM, Marcos Ortiz <mlor...@uci.cu> wrote:

> This is a very interesting setup to analyze. I'm working on a similar problem
> with HBase, so any help is welcome.
> 
> On 10/01/2013 16:39, Doug Meil wrote:
>> +1.
>> 
>> This question comes up enough on the dist-list it's worth getting some
>> pointers on record.
>> 
>> 
>> 
>> 
>> 
>> On 1/10/13 2:24 PM, "Mohammad Tariq" <donta...@gmail.com> wrote:
>> 
>>> Jack, Leonid,
>>> 
>>>    Could you both please continue the discussion on the thread
>>> itself, if possible? I would like to know about Jack's setup.
>>> I too find it quite interesting.
>>> 
>>> Many thanks.
>>> 
>>> Warm Regards,
>>> Tariq
>>> https://mtariq.jux.com/
>>> 
>>> 
>>> On Fri, Jan 11, 2013 at 12:50 AM, Leonid Fedotov
>>> <lfedo...@hortonworks.com> wrote:
>>> 
>>>> Jack,
>>>> yes, it would be very interesting to know your setup details.
>>>> Could you please provide more information?
>>>> Or we can take this off the list if you like...
>>>> 
>>>> Thank you!
>>>> 
>>>> Sincerely,
>>>> Leonid Fedotov
>>>> 
>>>> On Jan 10, 2013, at 9:24 AM, Jack Levin wrote:
>>>> 
>>>>> We stored about 1 billion images in HBase with file sizes up to 10MB.
>>>>> It's been running for close to 2 years without issues and serves
>>>>> delivery of images for Yfrog and ImageShack.  If you have any
>>>>> questions about the setup, I would be glad to answer them.
>>>>> 
>>>>> -Jack
>>>>> 
>>>>> On Sun, Jan 6, 2013 at 1:09 PM, Mohit Anchlia <mohitanch...@gmail.com>
>>>>> wrote:
>>>>>> I have done extensive testing and have found that blobs don't belong
>>>>>> in databases but are rather best left out on the file system. Andrew
>>>>>> outlined issues that you'll face, not to mention IO issues when
>>>>>> compaction occurs over large files.
>>>>>> 
>>>>>> On Sun, Jan 6, 2013 at 12:52 PM, Andrew Purtell <apurt...@apache.org>
>>>>>> wrote:
>>>>>>> I meant this to say "a few really large values"
>>>>>>> 
>>>>>>> On Sun, Jan 6, 2013 at 12:49 PM, Andrew Purtell
>>>>>>> <apurt...@apache.org> wrote:
>>>>>>> 
>>>>>>>> Consider if the split threshold is 2 GB but your one row contains
>>>>>>>> 10 GB as really large value.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> 
>>>>>>>   - Andy
>>>>>>> 
>>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>>>>>> Hein (via Tom White)
>>>>>>> 
>>>> 
> 
> -- 
> 
> Marcos Ortíz Valmaseda
> Blog: http://marcosluis2186.posterous.com
> Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
