Hi!
First of all, I apologize for sending this kind of off-topic e-mail to this
list.
I'm from Brazil and I tried to buy a ticket to HBaseCon last week, but
unfortunately tickets for HBaseCon were sold out right after I had received
authorization from my company to register for both HBaseCon and
Hi Sever
Coprocessors are still new for me, so I don't have a good answer for your
second question.
But for your first: (as far as I understand) remember that you can send
Puts/Deletes in any order, and the MemStore is responsible for keeping your data
sorted before flushing to a StoreFile, and k
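As a loose illustration of that point (plain java.util only, not the real HBase internals; the class and method names are mine), a sorted map accepts writes in any order and always iterates in sorted order, much like the MemStore sorts cells before a flush:

```java
import java.util.TreeMap;

public class SortedOnWrite {
    // Rough analogy for the MemStore: rows can arrive in any order,
    // but iteration (and therefore a "flush") always sees them sorted.
    static String flushOrder(String... rowsInArrivalOrder) {
        TreeMap<String, Boolean> memstore = new TreeMap<>();
        for (String row : rowsInArrivalOrder) {
            memstore.put(row, Boolean.TRUE); // out-of-order Puts are fine
        }
        return String.join(",", memstore.keySet());
    }

    public static void main(String[] args) {
        System.out.println(flushOrder("row3", "row1", "row2")); // row1,row2,row3
    }
}
```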
Hi Anand,
As usual, the answer is that 'it depends' :)
I think that the main question here is: why are you afraid that this setup
would lead to region server hotspotting? Is it because you don't know what your
production data will look like?
Based on what you told me about your rowkey, you will query m
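For readers following the hotspotting discussion: a common mitigation is salting the rowkey with a hash-derived bucket prefix so sequential keys spread across regions. A minimal sketch, assuming 8 buckets and a `bucket|key` layout (both my assumptions, not taken from the thread):

```java
public class SaltedKey {
    static final int BUCKETS = 8; // assumed number of salt buckets

    // Prefix the original key with a stable bucket id so sequential
    // keys spread across BUCKETS regions instead of one hot region.
    static String salt(String rowkey) {
        int bucket = Math.abs(rowkey.hashCode() % BUCKETS);
        return bucket + "|" + rowkey;
    }

    public static void main(String[] args) {
        // Scans must then fan out over all buckets for a given key range.
        System.out.println(salt("2012-07-17#event-001"));
    }
}
```

Note the prefix is derived from the key itself, so reads for a known key can recompute the bucket deterministically.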
Hi there!
Need to share this :) A few minutes ago I got my Cloudera Certified Specialist
in Apache HBase certification, with 42 correct answers out of 45!
I am very grateful to the following people and groups:
* All those who have shared knowledge at this list
* All those who have contributed f
and some other columns. I scan the table
> with column value filter for this case.
>
> I will evaluate salting as you have explained.
>
> Regards,
> Anand.C
>
> On Tue, Jul 17, 2012 at 12:30 AM, Cristofer Weber <
> cristofer.we...@neogrid.com> wrote:
>
> >
Hi Alex
Here we worked with bulk import, creating the HFiles in an MR job, and we
finish the load by calling the doBulkLoad method of the LoadIncrementalHFiles
class (probably the same method used by the completebulkload tool), and the
HFiles generated by reducer tasks are correctly 'adopted' by each corresponding
reg
Hi Cristofer,
The data I store is test cell reports about a component. I have many test cell
reports for each model number + serial number combination. So to make the
rowkey unique, I added a timestamp.
On Wed, Jul 18, 2012 at 3:14 AM, Cristofer Weber < cristofer.we...@neogrid.com>
wrote:
> So, An
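A composite rowkey like the one described (model number + serial number + timestamp) might be assembled as below. Reversing the timestamp so the newest report sorts first is an extra assumption of mine, not something stated in the thread:

```java
public class ReportKey {
    // model#serial#reversedTimestamp: fixed-width zero-padding keeps
    // lexicographic order consistent, and (Long.MAX_VALUE - ts) makes
    // the newest test cell report for a component sort first in a scan.
    static String rowkey(String model, String serial, long timestampMillis) {
        return String.format("%s#%s#%019d", model, serial,
                Long.MAX_VALUE - timestampMillis);
    }

    public static void main(String[] args) {
        System.out.println(rowkey("M100", "SN42", 1342569600000L));
    }
}
```

With this layout, a prefix scan on `model#serial#` returns all reports for one component, newest first.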
Hi Alex,
I ran one of our bulk import jobs with a partial payload, without proceeding
with a major compaction, and you are right: some HDFS blocks are in a different
datanode.
-Original Message-
From: Alex Baranau [mailto:alex.barano...@gmail.com]
Sent: Wednesday, July 18, 20
We are using CDH4
Sent from my iPad
On Jul 18, 2012, at 18:48, "Tony Dean" wrote:
> We are using HBase 0.94.0 against Hadoop 1.0.3, but plan to move to 0.23.x.
>
> -Original Message-
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: Wednesday, July 18, 2012 4:12 PM
> To: d...@hbase.
Hello Hari!
Just for the sake of maintaining sorted results, that's it. You have to keep it
in lexicographic order. An alternative, for example, could be maintaining
date|category as the RowKey and storing your N URLs as members of a Column
Family, where padded_visits could be the Column Qualifier and
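The reason a padded_visits qualifier needs padding at all is that HBase compares qualifiers as bytes, so zero-padding to a fixed width keeps lexicographic order in line with numeric order. A small sketch (the width of 10 is my assumption):

```java
public class PaddedVisits {
    // Fixed-width, zero-padded counters compare correctly as strings:
    // "0000000099" < "0000000100", whereas unpadded "99" > "100".
    static String pad(long visits) {
        return String.format("%010d", visits);
    }

    public static void main(String[] args) {
        System.out.println(pad(99));   // 0000000099
        System.out.println(pad(100));  // 0000000100
    }
}
```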
Hi Hari,
Using a date as the column qualifier is nice, but I experienced a drawback in a
scenario where I left the window open: I kept a large range of dates per RowKey,
and the number of rows per region became lower and lower as I started to split
regions.
You can manage this with TTL if you don't
Hi there
There are some really good ideas in this presentation from HBaseCon:
http://www.cloudera.com/resource/video-hbasecon-2012-real-performance-gains-with-real-time-data/
Regards,
Cristofer
-Original Message-
From: Alex Baranau [mailto:alex.barano...@gmail.com]
Sent: Thursday
There are also a lot of conversions from the same values to their byte array
representation, e.g., your NeighborStructure constants. You should do this
conversion only once to save time, since you are doing it inside 3 nested
loops. Not sure how much this can improve things, but you should try it as
well.
Be
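The advice about hoisting byte[] conversions out of the nested loops might look like this in plain Java (standard-library charset encoding stands in here for HBase's Bytes.toBytes; the constant names are invented for illustration):

```java
import java.nio.charset.StandardCharsets;

public class CachedBytes {
    // Convert constant strings to bytes once, at class-load time,
    // instead of re-encoding them on every loop iteration.
    static final byte[] FAMILY = "n".getBytes(StandardCharsets.UTF_8);
    static final byte[] QUALIFIER = "neighbor".getBytes(StandardCharsets.UTF_8);

    static long hotLoop(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            // Reuses the cached arrays; no per-iteration encoding work.
            total += FAMILY.length + QUALIFIER.length;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(hotLoop(1000)); // 9000
    }
}
```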
Just read this article, "Solving Big Data Challenges for Enterprise Application
Performance Management." published this month @ Volume 5, No.12 of Proceedings
of the VLDB Endowment, where they measured 6 different databases - Project
Voldemort, Redis, HBase, Cassandra, MySQL Cluster and VoltDB -
Nice to see some of the
more recent work done in the area of performance.
One thing the paper does touch on is the relative difficulty of standing up the
cluster, which has not changed since 0.90.4. I think that's definitely
something that could be improved upon.
- Dave
On Thu, Aug 3
[st...@duboce.net]
Sent: Thursday, August 30, 2012 19:04
To: user@hbase.apache.org
Subject: Re: [maybe off-topic?] article: Solving Big Data Challenges for
Enterprise Application Performance Management
On Thu, Aug 30, 2012 at 7:51 AM, Cristofer Weber
wrote:
> About HMasters, yes,
Hi there!
After I started studying HBase, I searched for open source projects backed
by HBase and found the Titan distributed graph database (you have probably
heard about it). As soon as I read in their documentation that the HBase
adapter is experimental and suboptimal (disclaimer here:
https://gith
end of each test, either clean up
your HBase or use a different "area" per test.
best regards,
ulrich
--
connect on xing or linkedin. sent from my tablet.
On 31.08.2012, at 06:46, Stack wrote:
> On Thu, Aug 30, 2012 at 4:44 PM, Cristofer Weber
> wrote:
>> Hi there!
>
singleton class + prefixing the table names with a random key (to allow
multiple tests in parallel on the same cluster without relying on
cleanup) + getProperty to decide between starting a mini cluster or connecting
to a cluster.
HTH,
Nicolas
On Fri, Aug 31, 2012 at 12:28 PM, Cristofer Weber &l
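A sketch of the pattern Nicolas describes: a random table-name prefix per run so parallel tests don't collide, plus a system property to choose between a mini cluster and a real one. The property name and prefix format here are my assumptions:

```java
import java.util.UUID;

public class TestClusterConfig {
    // One random prefix per JVM lets parallel test runs share a
    // cluster without colliding on table names or relying on cleanup.
    static final String RUN_PREFIX =
            "t" + UUID.randomUUID().toString().substring(0, 8) + "_";

    static String testTableName(String base) {
        return RUN_PREFIX + base;
    }

    // Assumed property: -Dhbase.test.external=true means "connect to a
    // real cluster" instead of starting an HBaseTestingUtility mini cluster.
    static boolean useExternalCluster() {
        return Boolean.getBoolean("hbase.test.external");
    }

    public static void main(String[] args) {
        System.out.println(testTableName("users"));
        System.out.println(useExternalCluster());
    }
}
```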
TH2,
Ulrich
On Fri, Aug 31, 2012 at 12:28 PM, Cristofer Weber <
cristofer.we...@neogrid.com> wrote:
> Hi Sonal, Stack and Ulrich!
>
> Yes, I should provide more details :$
>
> I reached the links you provided when I was searching for a way to
> start HBase with JUnit. F