Sorry.. Moving 'hbase' mailing list to BCC 'cause this is not related to
HBase. Adding 'hadoop' user group.
On Mon, Feb 11, 2013 at 10:22 AM, Something Something <
mailinglist...@gmail.com> wrote:
> Hello,
>
> We are running into performance issues with
Anybody? Alrighty then.. back to more debugging :-)
On Thu, May 17, 2012 at 5:06 PM, Something Something <
mailinglist...@gmail.com> wrote:
> HBase Version: hbase-0.90.4-cdh3u3
>
> Hadoop Version: hadoop-0.20.2-cdh3u2
>
>
> 12/05/17 16:37:47 ERROR mapreduce.LoadIncrem
te:
> Can you post the complete message ?
>
> What HBase version are you using ?
>
> On Thu, May 17, 2012 at 4:48 PM, Something Something <
> mailinglist...@gmail.com> wrote:
>
> > Hello,
> >
> > I keep getting this message while running the 'completebulkload' process.
Hello,
I keep getting this message while running the 'completebulkload' process.
I tried the following solutions that I came across while Googling for this
error:
1) setReduceSpeculativeExecution(true)
2) Made sure that none of the tasks are failing.
3) The HFileOutput job runs successfully.
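For anyone searching later: a minimal sketch, assuming the CDH3-era 0.90 API (table name and HFile path here are made up), of what 'completebulkload' does, driven programmatically via LoadIncrementalHFiles:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Directory produced by the HFileOutputFormat job (hypothetical path)
        Path hfileDir = new Path("/user/me/hfile-output");
        HTable table = new HTable(conf, "myTable");
        // Moves the HFiles into the table's regions, splitting any file
        // that straddles a region boundary.
        new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
    }
}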
ny reducers and making sure you don't have holes or regions that
> are too big due to the way the keys are partitioned. I was lucky enough to
> not have to go that far.
>
>
> On Thu, May 10, 2012 at 11:55 AM, Something Something <
> mailinglist...@gmail.com> wrote:
g to bulkload anyway (which requires Put or KeyValue values,
> both of which you can get the size from).
>
> On Sun, May 13, 2012 at 2:11 AM, Something Something <
> mailinglist...@gmail.com> wrote:
>
> > Is there no way to find out inside a single redu
alue in a row until the size reached a
> certain limit.
>
> On Sat, May 12, 2012 at 7:21 PM, Something Something <
> mailinglist...@gmail.com> wrote:
>
> > Hello,
> >
> > This is really a MapReduce question, but the output from this will be
> used
or a new region.
>
> Secondary sort is not necessary unless the order of the values matter for
> you. In this case (with the row key as the reducer key), I don't think
> that matters.
>
> On Thu, May 10, 2012 at 3:22 AM, Something Something <
> mailinglist...@gmail.com>
able. There is probably a better way to do this but it
> takes like 20 minutes to write.
>
> This whole process took less than an hour, with the bulk load part only
> taking 15 minutes. Much better!
>
> On Wed, May 9, 2012 at 11:08 AM, Something Something <
> mailinglist.
Avro class to create the Hfiles that I eventually moved into HBase
> with completebulkload. I haven't committed my class anywhere because it's
> a pretty ugly hack, but I'm happy to share it with you as a starting point.
> Doing billions of puts will just drive you crazy.
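A rough sketch of the approach being described, assuming the 0.90-era mapreduce API (job and table names invented); the mapper, omitted here, would emit ImmutableBytesWritable/Put pairs built from the Avro records:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AvroToHFilesJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "avro-to-hfiles");
        job.setJarByClass(AvroToHFilesJob.class);
        // job.setMapperClass(...): emit ImmutableBytesWritable keys + Puts
        HTable table = new HTable(conf, "myTable");
        // Wires up TotalOrderPartitioner + PutSortReducer so each reducer
        // writes HFiles aligned to the table's current region boundaries.
        HFileOutputFormat.configureIncrementalLoad(job, table);
        FileOutputFormat.setOutputPath(job, new Path("/user/me/hfile-output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}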
I ran the following MR job that reads AVRO files & puts them on HBase. The
files have tons of data (billions of records). We have a fairly decent size
cluster.
When I ran this MR job, it brought down HBase. When I commented out the
Puts on HBase, the job completed in 45 seconds (yes that's seconds).
Obvio
/**
 * Set the quietness-mode.
 *
 * In the quiet-mode, error and informational messages might not be logged.
 *
 * @param quietmode <code>true</code> to set quiet-mode on, <code>false</code>
 *              to turn it off.
 */
public synchronized void setQuietMode(boolean quietmode) {
  this.quietmode = quietmode;
}
Can someone tell me how to force a call to this? Apologies in advance for my
dumbness.
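In case a sketch helps: setQuietMode is a public instance method on org.apache.hadoop.conf.Configuration, so it can be called directly on a conf object before anything reads it, assuming you control where the Configuration is created (Pig builds its own internally, which may be the real obstacle):

import org.apache.hadoop.conf.Configuration;

public class VerboseConf {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setQuietMode(false); // log resource-parsing problems loudly
        conf.get("fs.default.name"); // first get() triggers loading conf files
    }
}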
On Wed, Mar 7, 2012 at 10:30 PM, Something Something <
mailinglist...@gmail.com> wrot
-- Forwarded message --
From: Something Something
Date: Thu, Mar 8, 2012 at 8:43 AM
Subject: Re: org.apache.hadoop.conf.Configuration - error parsing conf file
To: u...@pig.apache.org, manishbh...@rocketmail.com
*Stack*:
An explicit message would be one that would tell me which
Hello,
I am using: hadoop-0.20.2-cdh3u2, hbase-0.90.4-cdh3u3, pig-0.8.1-cdh3u3
I have successfully loaded data into HBase tables (implying my Hadoop &
HBase setup is good). I can look at the data using HBase shell.
Now I am trying to read data from HBase via a Pig Script. My test script
looks
you want to do with the stored data, that helps.
> the row key, column family and column name structure depends on what is
> your access pattern (both reads and writes) and sorting requirements.
>
> thanks
>
> On Sun, Feb 26, 2012 at 10:24 PM, Something Something <
> mailin
Trying to design an HBase schema for a log processing application. We will
get new logs every day.
1) We are thinking we will keep data for each day in separate tables. The
table names would be something like XYZ-2012-02-26 etc. There will be at
most 4 tables for each day.
Pros:
Other process
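For comparison while weighing the pros and cons: a minimal sketch (column family and key layout invented for illustration) of the usual single-table alternative, where the day goes in the row key instead of the table name:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class DailyLogKeys {
    // Row key "<yyyy-MM-dd>|<source>|<seq>" keeps each day's logs contiguous,
    // so one day becomes a start/stop-row Scan instead of a table per day.
    public static Put logPut(String day, String source, long seq, String line) {
        byte[] rowKey = Bytes.toBytes(day + "|" + source + "|" + seq);
        Put put = new Put(rowKey);
        put.add(Bytes.toBytes("log"), Bytes.toBytes("line"), Bytes.toBytes(line));
        return put;
    }
}

Note the trade-off: a date-leading key funnels all of today's writes into one region, which is one reason the table-per-day idea isn't crazy either.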
Hello,
Our Hadoop cluster is setup on EC2, but our client machine which will
trigger the M/R job is in our data center. I am trying to start a M/R job
from our client machine, but getting this:
00:01:16.885 [pool-6-thread-1] INFO org.apache.hadoop.ipc.Client -
Retrying connect to server:
ec2-xx
By no means am I a Netezza expert, but my manager seems to believe that our
existing Netezza based system can be replaced with a NOSQL (Key/Value) type
of database. If anyone has done Netezza to HBase migration, please share
your experiences.
As always, greatly appreciate the help.
> >From: Arvind Jayaprakash
> >To: user@hbase.apache.org
> >Sent: Thursday, September 8, 2011 2:49 AM
> >Subject: Re: HBase Vs CitrusLeaf?
> >
> >On Sep 06, Something Something wrote:
> >>Anyway, before I spent
I am a HUGE fan of HBase, but our management team wants us to evaluate
CitrusLeaf (http://citrusleaf.net/index.php). I have NO idea why! Our
management claims that CitrusLeaf is (got to be) faster because it's written
in C++. Trying to find if there's any truth to that.
Anyway, before I spent a
Hello,
Need to create a report that shows 'last 100 rows by timestamp'. This query
should return almost instantaneously. Any recommendation regarding the
design? I was thinking of creating a table with 'sequence #' as a key and
value would be 'key of another table that contains the master data'
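A minimal sketch of the common reverse-timestamp variant of that idea (table and family names invented), so the newest rows sort first and "last 100" is just the first 100 rows of a scan:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class Last100 {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable("events");

        // Write side: key = Long.MAX_VALUE - timestamp, so newest sorts first.
        long ts = System.currentTimeMillis();
        Put put = new Put(Bytes.toBytes(Long.MAX_VALUE - ts));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("masterKey"),
                Bytes.toBytes("row-key-in-master-table"));
        table.put(put);

        // Read side: the first 100 rows of a plain scan are the newest 100.
        Scan scan = new Scan();
        scan.setCaching(100); // pull all 100 rows in one round trip
        ResultScanner scanner = table.getScanner(scan);
        int n = 0;
        for (Result r : scanner) {
            if (++n > 100) break;
            // r.getValue(...) -> key of the master-data row
        }
        scanner.close();
        table.close();
    }
}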
What's the best way of implementing transaction management in HBase? I have
a use case in which I update multiple tables. If for some reason an update
fails on the 2nd table, I would like to roll back changes to the first
table. A quick Google search got me to this document:
http://hbase.apache.o
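To make the difficulty concrete, a minimal sketch of best-effort compensation, which is NOT a real transaction (HBase of this era has no cross-table atomicity, so a crash between steps can still leave the tables out of sync); names are invented:

import java.io.IOException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class CompensatingWrite {
    static void writeBoth(HTable first, HTable second, Put p1, Put p2)
            throws IOException {
        first.put(p1);
        try {
            second.put(p2);
        } catch (IOException e) {
            // Undo step 1. Caveat: this deletes the row even if p1 had
            // overwritten a pre-existing value, so it is only best-effort.
            first.delete(new Delete(p1.getRow()));
            throw e;
        }
    }
}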
Hello,
Are there scripts available to create a HBase cluster on Rackspace - like
there are for Amazon EC2? A quick Google search didn't come up with
anything useful.
Any help in this regard would be greatly appreciated. Thanks.
- Ajay
Thanks for the quick reply.
On Fri, May 20, 2011 at 2:19 PM, Jean-Daniel Cryans wrote:
> Here's an overview of what you can do
> http://blog.sematext.com/2011/03/11/hbase-backup-options/
>
> J-D
>
> On Fri, May 20, 2011 at 2:18 PM, Something Something
> wrote:
> > Looking for a reliable Backup/Restore solution.
Looking for a reliable Backup/Restore solution. Is Cluster Replication (
http://hbase.apache.org/replication.html) the only recommended way? We
don't have extra infrastructure needed at this client for replication. Just
creating a demo/prototype application for them.
Is there a utility that wi
IDs to hand out
> > and in case it dies it gets its next assigned 100 IDs and leaves a
> > small gap behind. That way you can take the pressure of the counter if
> > that is going to be an issue for you. Depends on your insert frequency
> > obviously.
> >
> > Lars
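A minimal sketch of that scheme (row, family, and qualifier names invented), reserving IDs in blocks of 100 via incrementColumnValue so the counter row sees one RPC per hundred IDs:

import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class BlockIdGenerator {
    private final HTable counters;
    private long next;    // next ID to hand out
    private long ceiling; // exclusive end of the reserved block

    public BlockIdGenerator(HTable counters) {
        this.counters = counters;
    }

    // A crash wastes at most the rest of the current block of 100.
    public synchronized long nextId() throws IOException {
        if (next == ceiling) {
            ceiling = counters.incrementColumnValue(Bytes.toBytes("ids"),
                    Bytes.toBytes("counter"), Bytes.toBytes("seq"), 100L);
            next = ceiling - 100L;
        }
        return next++;
    }
}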
Hello,
Can you please tell me if this is the proper way of designing a table that
has an auto-increment key? If there's a better way, please let me know that
as well.
After reading the mail archives, I learned that the best way is to use the
'incrementColumnValue' method of HTable.
So hypothet
sponse.
>> Longest is around 8 seconds. This was my first try at HBase and my next rev.
>> will be much better.
>>
>> -Pete
>>
>> PS At least you could use your name.
>>
>> Something Something wrote:
>>
>> =
>> Is it a
gh data to justify multi machine deployments, perhaps flat
> files?
>
> -ryan
>
> On Thu, Feb 3, 2011 at 2:48 PM, Something Something
> wrote:
> > Is it advisable to use HBase as a backend for a GUI app or is HBase more
> for
> > storing huge amounts of data used
want to
> pre-fetch per RPC. Setting it to 2 is already 2x better than the
> default.
>
> J-D
>
> On Thu, Feb 3, 2011 at 1:35 PM, Something Something
> wrote:
> > After adding the following line:
> >
> > scan.addFamily(Bytes.toBytes("Info"));
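For anyone hitting this thread later, a minimal sketch of the knob being discussed (the value 500 is just an illustration; it trades client memory for fewer round trips):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CachedScan {
    public static Scan build() {
        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("Info"));
        scan.setCaching(500); // rows pre-fetched per RPC; the default is 1
        return scan;
    }
}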
Is it advisable to use HBase as a backend for a GUI app or is HBase more for
storing huge amounts of data used mainly for data analysis in
non-online/batch mode? In other words, after storing data on HBase do most
people extract the summary and store it in a SQL database for quick
retrieval by GUI
Then there will be no impact of the other column families.
>
> > -Original Message-
> > From: Something Something [mailto:mailinglist...@gmail.com]
> > Sent: Thursday, February 03, 2011 11:28 AM
> > To: user@hbase.apache.org
> > Subject: Re: Fastest way to
count += 1
> next unless (block_given? && count % interval == 0)
> # Allow command modules to visualize counting process
> yield(count, String.from_java_bytes(row.getRow))
> end
>
> # Return the counter
> return count
> end
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html
> St.Ack
>
> On Thu, Feb 3, 2011 at 6:01 AM, Something Something
> wrote:
> > I want to read only the keys in a table. I tried this...
> >
> >try {
> >
> > HTable table =
I want to read only the keys in a table. I tried this...
try {
    HTable table = new HTable("myTable");
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("Info"));
    ResultScanner scanner = table.getScanner(scan);
    Result result = scanner.next();
    while (result != null) {
& so on...
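Picking up St.Ack's pointer, a minimal sketch of the same loop with FirstKeyOnlyFilter, which makes each row return only its first KeyValue so the scan is close to keys-only:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

public class KeysOnly {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable("myTable");
        Scan scan = new Scan();
        scan.setFilter(new FirstKeyOnlyFilter()); // one KeyValue per row
        ResultScanner scanner = table.getScanner(scan);
        for (Result result : scanner) {
            byte[] key = result.getRow(); // the row key
        }
        scanner.close();
        table.close();
    }
}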
Stack - Any thoughts on this?
On Mon, Jan 31, 2011 at 6:27 PM, Something Something <
mailinglist...@gmail.com> wrote:
> 1) Version numbers:
>
> hadoop-0.20.2
> hbase-0.20.6
>
>
> 2) autoFlush to 'true' works, but wouldn't that slow down the insertion
put.add(Bytes.toBytes("info"), Bytes.toBytes("code"), Bytes.toBytes(code));
& so on... and at the end...
table.put(put);
Is this not the right way to do it? Please let me know. Thanks for the
help.
On Sun, Jan 30, 2011 at 3:03 PM, Stack wrote:
> What version of hbase+hadoop
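A minimal sketch of the client-side batching pattern this thread circles around (the buffer size is illustrative), which usually resolves the speed-vs-autoFlush tension:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedPuts {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "myTable");
        table.setAutoFlush(false);                  // buffer puts client-side
        table.setWriteBufferSize(12 * 1024 * 1024); // flush every ~12 MB

        for (long i = 0; i < 100000; i++) {
            Put put = new Put(Bytes.toBytes(i));
            put.add(Bytes.toBytes("info"), Bytes.toBytes("code"), Bytes.toBytes(i));
            table.put(put); // queued, not sent, until the buffer fills
        }
        table.flushCommits(); // push whatever is still buffered
        table.close();
    }
}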
ind out why you're not getting
> that data back if it's supposed to exist.
>
> J-D
>
> On Thu, Jan 20, 2011 at 11:52 PM, Something Something
> wrote:
> > I have a column that looks like this under hbase shell:
> >
> > column=Request:placement, timestamp
I have a column that looks like this under hbase shell:
column=Request:placement, timestamp=1295593730949,
value=specific.ea.tracking.promo.deadspace2
In my code I have something like this...
byte[] value = result.getValue(Bytes.toBytes("Request"),
Bytes.toBytes("placement"));
LOG.info("Pla
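If getValue comes back null, a minimal debugging sketch along the lines J-D suggests is to dump what the Result actually contains; this catches family/qualifier typos and scans that never requested the "Request" family:

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class DumpResult {
    static void dump(Result result) {
        byte[] value = result.getValue(Bytes.toBytes("Request"),
                Bytes.toBytes("placement"));
        if (value == null) {
            // Show every cell the scan actually returned for this row.
            for (KeyValue kv : result.raw()) {
                System.out.println(kv + " = " + Bytes.toString(kv.getValue()));
            }
        } else {
            System.out.println(Bytes.toString(value));
        }
    }
}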