Hello Serega,
We use it this way here, via a Python image manipulation service named Thumbor
https://github.com/thumbor/thumbor/
plus a plugin of my own:
https://github.com/thumbor/thumbor/wiki/Plugins#thumbor_hbase-by-damien-hardy
One big advantage is that you can use it with a lazy-loading plugin
Hello,
We are trying to export an HBase table to S3 for backup purposes.
By default the export tool runs one map per region, and we want to limit the
output bandwidth to the internet (to Amazon S3).
We were thinking of adding some reducers to limit the number of writers,
but this is explicitly hardcoded to 0 in
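As a back-of-the-envelope sketch of the reducer idea (the throughput figures below are illustrative, not measured), capping concurrent writers bounds the aggregate upload rate:

```python
def reducers_for_bandwidth_cap(cap_mbps: float, per_stream_mbps: float) -> int:
    """Max number of concurrent writers (reducers) whose combined
    upload throughput stays at or under the uplink cap."""
    if per_stream_mbps <= 0:
        raise ValueError("per-stream throughput must be positive")
    return max(1, int(cap_mbps // per_stream_mbps))

# e.g. a 400 Mbps uplink cap, each writer pushing ~50 Mbps to S3
print(reducers_for_bandwidth_cap(400, 50))  # -> 8
```

With one map per region, 421 regions would mean up to 421 concurrent uploads; 8 reducers would serialize that into 8 streams.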
On Jun 4, 2014, at 5:39 AM, Damien Hardy dha...@viadeoteam.com wrote:
Hello,
We are trying to export an HBase table to S3 for backup purposes.
By default the export tool runs one map per region, and we want to limit the
output bandwidth to the internet (to Amazon S3).
We were thinking of adding some reducers
-Mike
On Dec 18, 2012, at 3:33 AM, Damien Hardy dha...@viadeoteam.com wrote:
Hello,
There is a middle ground between sequential keys (hot-spotting risk) and MD5
(heavy scans):
* you can use composite keys with a leading field that segregates the data
(hostname, product name, metric name), like OpenTSDB
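A minimal sketch of such a composite key, OpenTSDB-style (the field names are illustrative): a segregating field first, then a big-endian timestamp so rows within one series stay contiguous and time-sorted.

```python
import struct

def composite_key(metric: str, timestamp: int) -> bytes:
    """Build a row key of the form <metric>\\x00<big-endian int64 timestamp>.
    Rows for one metric stay contiguous (cheap range scans) while
    different metrics spread across regions (no single hot spot)."""
    return metric.encode("utf-8") + b"\x00" + struct.pack(">q", timestamp)

k1 = composite_key("cpu.load", 1700000000)
k2 = composite_key("cpu.load", 1700000060)
assert k1 < k2  # within one metric, keys sort by time
```

Big-endian packing matters: it makes lexicographic byte order match numeric time order for the scan.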
are primarily going to access
the data. You can then determine the best way to store the data to gain
the best performance. For some applications... the region hot spotting
isn't an important issue.
Note YMMV
HTH
-Mike
On Dec 18, 2012, at 3:33 AM, Damien Hardy dha...@viadeoteam.com wrote:
Hello
:408)
at java.lang.Thread.run(Thread.java:722)
Doesn't that mean HBase can only specify a real NameNode, not a
nameservice ID? If so, HBase will fail if the NameNode crashes, even with
HDFS HA configured.
--
Damien HARDY
IT Infrastructure Architect
Viadeo - 30 rue de la Victoire - 75009 Paris - France
And what about the visibility of the HDFS configuration in the HBase
classpath? (The classpath should appear in the startup log of the HBase
processes.)
2013/5/24 Azuryy Yu azury...@gmail.com
they are all configured as you pointed out.
--Send from my Sony mobile.
On May 24, 2013 6:18 PM, Damien Hardy dha...@viadeoteam.com
ideas?
Thanks a million.
Regards,
Shahab
workflow).
The most efficient (one job) would be a pure home-made Java MapReduce job
(mapper-only, one per MySQL DB, bulk loading into HTables).
Cheers,
Actually the concurrency is limited by the number of map slots available in
the JobTracker (MR1).
The last map tasks wait for the first ones to finish.
improvements in HBase
The slides have been posted up on meetup. See Andrew's listing of them on
the main page: http://www.meetup.com/hbaseusergroup/events/96584102/
St.Ack
IMO the easiest would be HBase export, for long-term offline backup (for
disaster recovery). The output can even be stored on a different HDFS
cluster than the one used by HBase, by using a full hdfs:// URL as the
destination directory.
On Mar 5, 2013, at 10:52 PM, Leonid Fedotov lfedo...@hortonworks.com wrote:
Hello,
Why not use a Pig script for that?
Make the JSON file available on HDFS,
load it with
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonLoader.html
Store with
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
Hi there,
Thank you, and happy new year.
I had the same problem and wrote a Python module⁰ for Thumbor¹.
I use the Thrift interface for HBase to store the image blobs.
As already said, you have to keep the image blobs quite small (for web
latency reasons you have to keep them small anyway), ~100 KB.
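One way to respect that size limit is to chunk large blobs across several cells before putting them through Thrift. This is an illustrative sketch (the 100 KB threshold and the column naming scheme are assumptions, not what the actual module does):

```python
CHUNK = 100 * 1024  # ~100 KB per cell

def chunk_blob(blob: bytes, chunk_size: int = CHUNK):
    """Split a blob into cell-sized pieces, yielding (column, bytes)
    pairs suitable for a single multi-column put via the Thrift API."""
    for i in range(0, len(blob), chunk_size):
        yield f"image:part{i // chunk_size:04d}", blob[i:i + chunk_size]

parts = list(chunk_blob(b"\x00" * 250_000))
assert len(parts) == 3  # 100 KB + 100 KB + ~44 KB
```

Zero-padding the part index keeps the columns in read-back order when you scan the family.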
Hello Jean-Marc,
Bloom filters are designed exactly for that.
But they can only tell you that a row does not exist, based on a hash of the
key (not the opposite: two row keys can have the same hash result).
If you want to be sure the row key exists you have to search for it in the
HFile (the whole mechanism is transparent
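A toy Bloom filter makes that asymmetry concrete: a "no" is definitive, a "yes" may be a hash collision. The sizes and hash construction here are illustrative, not HBase's actual implementation:

```python
import hashlib

class TinyBloom:
    def __init__(self, nbits: int = 1024, nhashes: int = 3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray(nbits)  # one byte per bit, for clarity

    def _positions(self, key: bytes):
        # derive nhashes independent positions from seeded MD5 digests
        for seed in range(self.nhashes):
            h = hashlib.md5(bytes([seed]) + key).digest()
            yield int.from_bytes(h[:8], "big") % self.nbits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p] for p in self._positions(key))

bf = TinyBloom()
bf.add(b"row-0001")
assert bf.might_contain(b"row-0001")  # never a false negative
# might_contain() on an absent key can still return True: that is the
# false-positive case where you must fall back to reading the HFile.
```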
a range of date value one time on the date by
MD5. How to balance this issue?
Thanks.
To be correct, my remark about HDFS HA is not even relevant here: an HBase
client just needs the ZK quorum.
2012/11/14 Carsten Schnober schno...@ids-mannheim.de
On 14.11.2012 17:04, Damien Hardy wrote:
Hi Damien,
If ZooKeeper is running on server1, then there is
no need for SSH access: just run `hbase
out as an HFile.
On Fri, Nov 9, 2012 at 8:52 AM, Damien Hardy dha...@viadeoteam.com
wrote:
Ok I can reply to myself ...
you have to add a clone of the KeyValue in the Put. So
p.add(kv);
becomes
p.add(kv.clone());
If not, I suppose only the last one is added in HBase
Ok I can reply to myself ...
you have to add a clone of the KeyValue in the Put. So
p.add(kv);
becomes
p.add(kv.clone());
If not, I suppose only the last one is added in HBase (but the result is
quite weird and should be fixed IMO)
Cheers,
2012/11/9 Damien Hardy dha
Hello,
Take a look at http://opentsdb.net/overview.html; it really looks like what
you are describing.
Cheers
2012/10/3 Wendy Buster stevebuster...@hotmail.com
I use a data historian (sometimes called a time series database) for
collecting and persisting large numbers (billions) of rows of
Hello
2012/10/2 Marcos Ortiz mlor...@uci.cu
Another thing that I'm seeing is that one of your main processes is
compaction, so you can optimize all this by increasing the size of your
regions (by default the size of a region is 256 MB), but you may end up
with a split/compaction storm
, if you find bugs while a release is
in progress, it increases your chances of getting your bugs fixed...
Nicolas
On Thu, Sep 27, 2012 at 10:37 AM, Damien Hardy dha...@viadeoteam.com
wrote:
Actually, I have an old cluster on prod with version 0.90.3 installed
manually, and I am working
Hello,
Corollary: what is the best way to migrate data from a 0.90 cluster to a
0.92 cluster?
HBase 0.90 => Client 0.90 => stdout | stdin => Client 0.92 => HBase 0.92
All the data must transit through a single host where the two clients run.
It may be parallelized with multiple versions working with
more complex? A kind of realtime
replication between two clusters in two different versions?
On Thu, Sep 27, 2012 at 9:56 AM, Damien Hardy dha...@viadeoteam.com
wrote:
Hello,
Corollary: what is the best way to migrate data from a 0.90 cluster to
a
0.92 cluster?
HBase 0.90 => Client
On 20/07/2012 18:22, Jonathan Bishop wrote:
Hi,
I know it is commonly suggested to use an MD5 checksum to create a row
key from some other identifier, such as a string or long. This is usually
done to guard against hot-spotting and seems to work well.
My concern is that there is no guard
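A common variant of that pattern sidesteps the collision concern: prefix the key with a few hex characters of the MD5 digest but keep the original identifier in the key, so two identifiers that happen to share a prefix still produce distinct row keys. A minimal sketch (prefix length and separator are illustrative):

```python
import hashlib

def salted_key(ident: str, prefix_len: int = 4) -> str:
    """Prefix the identifier with the first few hex chars of its MD5.
    Writes spread across the keyspace, yet the full identifier stays
    in the key, so even a digest-prefix collision cannot merge rows."""
    digest = hashlib.md5(ident.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{ident}"

print(salted_key("user:12345"))
```

The trade-off is that range scans over the original identifier order are lost, which is exactly the sequential-vs-hashed balance discussed above.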
Hi Jean-Marc,
I reply in your text.
On 12/06/2012 23:42, Jean-Marc Spaggiari wrote:
Hi,
I have read all the documentation here
http://hbase.apache.org/book/book.html and I now have few questions.
I currently have a MySQL table with millions of lines (4 for now, but
it's growing by 4
is: how to create a Scanner in JSON?
And how do I specify a filter? (It is only documented as a string in the schema.)
Thank you,
Cheers,
Hello,
I am trying to copy a table from one cluster to another.
The source is a 2-node cluster, 16 CPUs / 32 GB RAM (hadoop001, hadoop002).
The destination is a 3-node cluster, 16 CPUs / 64 GB RAM (hbase01, hbase02, hbase04).
All the nodes run the datanode, regionserver, master server and ZooKeeper
roles, on CDH3u3.
region
for is that if you are starting with
421 regions on the source but the dest table isn't pre-split then it's
going to try to slam all the data into one region and then have to split
(and split and split, etc.).
http://hbase.apache.org/book.html#perf.writing
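Pre-splitting the destination table avoids that split storm. Assuming roughly uniform key prefixes (e.g. MD5-hashed keys; the single-byte granularity here is an illustrative assumption), the split points can be computed like this:

```python
def split_points(nregions: int) -> list[bytes]:
    """Evenly spaced single-byte split points over 0x00..0xFF,
    yielding nregions regions (nregions - 1 split keys) for a
    pre-split table creation."""
    if nregions < 2:
        return []
    step = 256 / nregions
    return [bytes([int(step * i)]) for i in range(1, nregions)]

splits = split_points(16)
assert len(splits) == 15 and splits == sorted(splits)
```

The resulting byte strings would be passed as the SPLITS argument when creating the destination table, so the 421 source regions land across 16 regions from the first write instead of one.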
On 5/7/12 8:22 AM, Damien HARDY dha
Hello,
If you have the default /etc/zookeeper/zoo.cfg, try to rename or remove it.
It takes over the HBase ZooKeeper quorum configuration from hbase-site.xml.
Cheers,
On 07/05/2012 17:17, Subir S wrote:
Hello,
Version:0.90.4-CDH3U3
HBase managed ZK
I tried to run a simple
On 08/03/2012 09:18, Mohammad Tariq wrote:
Hello list,
We are planning to index our data stored in HBase using Solr. As we
are totally new to Solr, we would like to have some comments from
someone who is already doing it. While looking over the internet we
came across Lily. Is there any
Hello,
I wrote some code in Python using HBase as image storage.
I want my code to be tested independently of a full external HBase
architecture, so my question is:
Is there a howto that helps instantiate a temporary local
minicluster + Thrift interface in order to pass Python (or maybe
On 26/01/2012 14:43, yonghu wrote:
Hello,
I read this blog: http://outerthought.org/blog/465-ot.html. It mentions
that the major compaction will occur every 24 hours. My question is
whether there are any other conditions that can trigger a major
compaction. For example, when the
Hello,
Yesterday I created an HTable with 2 CFs, specifying TTLs of 5 and 10
minutes respectively.
I inserted 2 values (one in each column family)
and hoped that my values would disappear after a certain amount of time.
This never happened ...
This morning I kept hoping that major compaction, once a day,
= '60'
TTL = '30'
It's just three orders of magnitude different from what you thought
you set the TTL to :)
J-D
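The unit is the catch: HBase column-family TTLs are expressed in seconds, so intended minutes have to be converted before being set on the family:

```python
def ttl_seconds(minutes: int) -> int:
    """HBase column-family TTL is specified in seconds, so convert
    an intended lifetime in minutes before setting it."""
    return minutes * 60

assert ttl_seconds(5) == 300    # a 5-minute TTL is TTL => '300'
assert ttl_seconds(10) == 600   # a 10-minute TTL is TTL => '600'
```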
On Fri, Sep 23, 2011 at 2:22 AM, Damien Hardy dha...@figarocms.fr wrote:
Hello,
I created yesterday an HTable with 2 CF specifying the TTL for 5 and 10 min