Re: Loader for small files

2013-02-11 Thread Something Something
Sorry.. Moving 'hbase' mailing list to BCC 'cause this is not related to HBase. Adding 'hadoop' user group. On Mon, Feb 11, 2013 at 10:22 AM, Something Something < mailinglist...@gmail.com> wrote: > Hello, > > We are running into performance issues with

Re: Trailer 'header' is wrong; does the trailer size match content

2012-05-18 Thread Something Something
Anybody? Alrighty then.. back to more debugging -:) On Thu, May 17, 2012 at 5:06 PM, Something Something < mailinglist...@gmail.com> wrote: > HBase Version: hbase-0.90.4-cdh3u3 > > Hadoop Version: hadoop-0.20.2-cdh3u2 > > > 12/05/17 16:37:47 ERROR mapreduce.LoadIncrem

Re: Trailer 'header' is wrong; does the trailer size match content

2012-05-17 Thread Something Something
te: > Can you post the complete message ? > > What HBase version are you using ? > > On Thu, May 17, 2012 at 4:48 PM, Something Something < > mailinglist...@gmail.com> wrote: > > > Hello, > > > > I keep getting this message while running the 'completebu

Trailer 'header' is wrong; does the trailer size match content

2012-05-17 Thread Something Something
Hello, I keep getting this message while running the 'completebulkload' process. I tried the following solutions that I came across while Googling for this error: 1) setReduceSpeculativeExecution(true) 2) Made sure that none of the tasks are failing. 3) The HFileOutput job runs successfully.

Re: HBase Performance Improvements?

2012-05-16 Thread Something Something
ny reducers and making sure you don't have holes or regions that > are too big due to the way the keys are partitioned. I was lucky enough to > not have to go that far. > > > On Thu, May 10, 2012 at 11:55 AM, Something Something < > mailinglist...@gmail.com> wrote:

Re: MR job for creating splits

2012-05-13 Thread Something Something
g to bulkload anyway (which requires Put or KeyValue values, > both of which you can get the size from). > > On Sun, May 13, 2012 at 2:11 AM, Something Something < > mailinglist...@gmail.com> wrote: > > > Is there no way to find out inside a single redu

Re: MR job for creating splits

2012-05-12 Thread Something Something
alue in a row until the size reached a > certain limit. > > On Sat, May 12, 2012 at 7:21 PM, Something Something < > mailinglist...@gmail.com> wrote: > > > Hello, > > > > This is really a MapReduce question, but the output from this will be > used >
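The approach hinted at in the reply above (walk the sorted rows and start a new split once the accumulated size reaches a limit) might look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `SplitPointSketch`, the method `splitKeys`, and the idea of passing row sizes as a plain array are all hypothetical, and plain Java collections stand in for whatever the reducer would actually see.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: derives region split keys from sorted rows and their sizes. */
class SplitPointSketch {

    /**
     * @param sortedRowKeys row keys in ascending order (as a single reducer would see them)
     * @param rowSizes      rowSizes[i] is the byte size of row i (assumed to be known,
     *                      e.g. taken from the Put or KeyValue being written)
     * @param limitBytes    target amount of data per region
     * @return the row keys at which a new region should start
     */
    static List<String> splitKeys(List<String> sortedRowKeys, long[] rowSizes, long limitBytes) {
        if (sortedRowKeys.size() != rowSizes.length) {
            throw new IllegalArgumentException("keys and sizes must have the same length");
        }
        List<String> splits = new ArrayList<>();
        long running = 0; // bytes accumulated in the current region so far
        for (int i = 0; i < rowSizes.length; i++) {
            if (running >= limitBytes) {
                // The current region is full, so this row becomes the start of the next one.
                splits.add(sortedRowKeys.get(i));
                running = 0;
            }
            running += rowSizes[i];
        }
        return splits;
    }
}
```

Because the check happens before a row is added, a region can overshoot the limit by at most one row, which is usually acceptable for pre-splitting.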

Re: HBase Performance Improvements?

2012-05-10 Thread Something Something
or a new region. > > Secondary sort is not necessary unless the order of the values matter for > you. In this case (with the row key as the reducer key), I don't think > that matters. > > On Thu, May 10, 2012 at 3:22 AM, Something Something < > mailinglist...@gmail.com>

Re: HBase Performance Improvements?

2012-05-10 Thread Something Something
able. There is probably a better way to do this but it > takes like 20 minutes to write. > > This whole process took less than an hour, with the bulk load part only > taking 15 minutes. Much better! > > On Wed, May 9, 2012 at 11:08 AM, Something Something < > mailinglist.

Re: HBase Performance Improvements?

2012-05-09 Thread Something Something
Avro class to create the Hfiles that I eventually moved into HBase > with completebulkload. I haven't committed my class anywhere because it's > a pretty ugly hack, but I'm happy to share it with you as a starting point. > Doing billions of puts will just drive you crazy.

HBase Performance Improvements?

2012-05-09 Thread Something Something
I ran the following MR job that reads AVRO files & puts them on HBase. The files have tons of data (billions). We have a fairly decent size cluster. When I ran this MR job, it brought down HBase. When I commented out the Puts on HBase, the job completed in 45 seconds (yes that's seconds). Obvio

Re: org.apache.hadoop.conf.Configuration - error parsing conf file

2012-03-08 Thread Something Something
to turn it off. */ public synchronized void setQuietMode(boolean quietmode) { this.quietmode = quietmode; } Can someone tell me how to force call to this? Apologies in advance for my dumbness. On Wed, Mar 7, 2012 at 10:30 PM, Something Something < mailinglist...@gmail.com> wrot

Fwd: org.apache.hadoop.conf.Configuration - error parsing conf file

2012-03-08 Thread Something Something
-- Forwarded message -- From: Something Something Date: Thu, Mar 8, 2012 at 8:43 AM Subject: Re: org.apache.hadoop.conf.Configuration - error parsing conf file To: u...@pig.apache.org, manishbh...@rocketmail.com *Stack*: Explicit message would be one that would tell me which

org.apache.hadoop.conf.Configuration - error parsing conf file

2012-03-07 Thread Something Something
Hello, I am using: hadoop-0.20.2-cdh3u2, hbase-0.90.4-cdh3u3, pig-0.8.1-cdh3u3 I have successfully loaded data into HBase tables (implying my Hadoop & HBase setup is good). I can look at the data using HBase shell. Now I am trying to read data from HBase via a Pig Script. My test script looks

Re: Couple of schema design questions

2012-02-26 Thread Something Something
you want to do with the stored data, that helps. > the row key, column family and column name structure depends on what is > your access pattern (both reads and writes) and sorting requirements. > > thanks > > On Sun, Feb 26, 2012 at 10:24 PM, Something Something < > mailin

Couple of schema design questions

2012-02-26 Thread Something Something
Trying to design an HBase schema for a log processing application. We will get new logs every day. 1) We are thinking we will keep data for each day in separate tables. The table names would be something like XYZ-2012-02-26 etc. There will be at most 4 tables for each day. Pros: Other process

Starting Map Reduce Job on EC2

2012-01-15 Thread Something Something
Hello, Our Hadoop cluster is set up on EC2, but our client machine which will trigger the M/R job is in our data center. I am trying to start a M/R job from our client machine, but getting this: 00:01:16.885 [pool-6-thread-1] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: ec2-xx

HBase as a replacement for Netezza?

2011-09-07 Thread Something Something
By no means am I a Netezza expert, but my manager seems to believe that our existing Netezza based system can be replaced with a NoSQL (Key/Value) type of database. If anyone has done Netezza to HBase migration, please share your experiences. As always, greatly appreciate the help.

Re: HBase Vs CitrusLeaf?

2011-09-07 Thread Something Something
> >From: Arvind Jayaprakash > >To: user@hbase.apache.org > >Sent: Thursday, September 8, 2011 2:49 AM > >Subject: Re: HBase Vs CitrusLeaf? > > > >On Sep 06, Something Something wrote: > >>Anyway, before I spent

HBase Vs CitrusLeaf?

2011-09-06 Thread Something Something
I am a HUGE fan of HBase, but our management team wants us to evaluate CitrusLeaf (http://citrusleaf.net/index.php). I have NO idea why! Our management claims that CitrusLeaf is (got to be) faster because it's written in C++. Trying to find if there's any truth to that. Anyway, before I spent a

Retrieving last 100 rows by timestamp

2011-07-26 Thread Something Something
Hello, Need to create a report that shows 'last 100 rows by timestamp'. This query should return almost instantaneously. Any recommendation regarding the design? I was thinking of creating a table with 'sequence #' as a key and value would be 'key of another table that contains the master data'
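One common alternative to a sequence-number key for "latest N" queries is a reverse-timestamp row key, so that the newest rows sort first and a short scan from the start of the table returns them. The sketch below is not the poster's design; it is a minimal illustration that assumes a `TreeMap` as a stand-in for a sorted HBase index table, and the names `LatestRowsIndex`, `reverseKey`, and `latest` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory model of an index table keyed by reversed timestamp. */
class LatestRowsIndex {

    // Ascending key order, like rows in an HBase table (assumption: keys are
    // non-negative longs, whose big-endian bytes sort the same way).
    private final TreeMap<Long, String> index = new TreeMap<>();

    /** Newer timestamps map to smaller keys, so they sort first. */
    static long reverseKey(long timestampMillis) {
        return Long.MAX_VALUE - timestampMillis;
    }

    /** Records that the master-table row masterRowKey was written at timestampMillis. */
    void add(long timestampMillis, String masterRowKey) {
        // Note: two rows with the same timestamp would overwrite each other here;
        // a real key would need a tie-breaker suffix.
        index.put(reverseKey(timestampMillis), masterRowKey);
    }

    /** Returns up to n master row keys, newest first (the equivalent of a short scan). */
    List<String> latest(int n) {
        List<String> out = new ArrayList<>();
        for (String masterRowKey : index.values()) {
            if (out.size() >= n) {
                break;
            }
            out.add(masterRowKey);
        }
        return out;
    }
}
```

The report would then read the first 100 entries of the index and fetch the corresponding master rows, without scanning the whole table.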

Transaction Management in HBase?

2011-06-12 Thread Something Something
What's the best way of implementing transaction management in HBase? I have a use case in which I update multiple tables. If for some reason an update fails on the 2nd table, I would like to rollback changes to the first table. A quick Google search got me to this document: http://hbase.apache.o

Starting Hadoop/HBase cluster on Rackspace

2011-05-31 Thread Something Something
Hello, Are there scripts available to create an HBase cluster on Rackspace - like there are for Amazon EC2? A quick Google search didn't come up with anything useful. Any help in this regard would be greatly appreciated. Thanks. - Ajay

Re: Backup/Restore HBase

2011-05-20 Thread Something Something
the quick reply. On Fri, May 20, 2011 at 2:19 PM, Jean-Daniel Cryans wrote: > Here's an overview of what you can do > http://blog.sematext.com/2011/03/11/hbase-backup-options/ > > J-D > > On Fri, May 20, 2011 at 2:18 PM, Something Something > wrote: > > Looking fo

Backup/Restore HBase

2011-05-20 Thread Something Something
Looking for a reliable Backup/Restore solution. Is Cluster Replication ( http://hbase.apache.org/replication.html) the only recommended way? We don't have extra infrastructure needed at this client for replication. Just creating a demo/prototype application for them. Is there a utility that wi

Re: Designing table with auto increment key

2011-02-14 Thread Something Something
IDs to hand out > > and in case it dies it gets its next assigned 100 IDs and leaves a > > small gap behind. That way you can take the pressure of the counter if > > that is going to be an issue for you. Depends on your insert frequency > > obviously. > > > > Lars
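The batching idea quoted above (each client reserves a block of 100 IDs and, if it dies, simply leaves a small gap) can be sketched as follows. This is a minimal sketch, not code from the thread: an in-memory `AtomicLong` stands in for the shared HBase counter, and the class name `BlockIdAllocator` and its `blockSize` parameter are hypothetical. In a real deployment the reservation step would presumably be a single `incrementColumnValue` call on `HTable` with the block size as the amount.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical allocator: hands out IDs locally, reserving them from a shared counter in blocks. */
class BlockIdAllocator {

    private final AtomicLong sharedCounter; // stand-in for the shared HBase counter cell (assumption)
    private final long blockSize;
    private long next = 0;  // next ID to hand out
    private long limit = 0; // first ID past the reserved block; 0 forces a reservation on first use

    BlockIdAllocator(AtomicLong sharedCounter, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.sharedCounter = sharedCounter;
        this.blockSize = blockSize;
    }

    /** Returns a unique ID, touching the shared counter only once per block. */
    synchronized long nextId() {
        if (next >= limit) {
            // One atomic increment reserves blockSize IDs: [limit - blockSize, limit).
            limit = sharedCounter.addAndGet(blockSize);
            next = limit - blockSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong(0);
        BlockIdAllocator a = new BlockIdAllocator(counter, 100);
        BlockIdAllocator b = new BlockIdAllocator(counter, 100);
        System.out.println(a.nextId()); // first ID from a's block
        System.out.println(b.nextId()); // first ID from b's block
    }
}
```

IDs stay unique across allocators but are not strictly sequential: any IDs left unused in a block when a client stops are never handed out, which is the gap mentioned in the reply.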

Designing table with auto increment key

2011-02-13 Thread Something Something
Hello, Can you please tell me if this is the proper way of designing a table that's got an auto increment key? If there's a better way please let me know that as well. After reading the mail archives, I learned that the best way is to use the 'incrementColumnValue' method of HTable. So hypothet

Re: HBase as a backend for GUI app?

2011-02-03 Thread Something Something
sponse. >> Longest is around 8 seconds. This was my first try at HBase and my next rev. >> will be much better. >> >> -Pete >> >> PS At least you could use your name. >> >> Something Something wrote: >> >> = >> Is it a

Re: HBase as a backend for GUI app?

2011-02-03 Thread Something Something
gh data to justify multi machine deployments, perhaps flat > files? > > -ryan > > On Thu, Feb 3, 2011 at 2:48 PM, Something Something > wrote: > > Is it advisable to use HBase as a backend for a GUI app or is HBase more > for > > storing huge amounts of data used

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
want to > pre-fetch per RPC. Setting it to 2 is already 2x better than the > default. > > J-D > > On Thu, Feb 3, 2011 at 1:35 PM, Something Something > wrote: > > After adding the following line: > > > > scan.addFamily(Bytes.toBytes("Info")); > >

HBase as a backend for GUI app?

2011-02-03 Thread Something Something
Is it advisable to use HBase as a backend for a GUI app or is HBase more for storing huge amounts of data used mainly for data analysis in non-online/batch mode? In other words, after storing data on HBase do most people extract the summary and store it in a SQL database for quick retrieval by GUI

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
Then there will be no impact of the other column families. > > > -Original Message- > > From: Something Something [mailto:mailinglist...@gmail.com] > > Sent: Thursday, February 03, 2011 11:28 AM > > To: user@hbase.apache.org > > Subject: Re: Fastest way to

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
count += 1 >next unless (block_given? && count % interval == 0) ># Allow command modules to visualize counting process >yield(count, String.from_java_bytes(row.getRow)) > end > > # Return the counter > return count >en

Re: Fastest way to read only the keys of a HTable?

2011-02-02 Thread Something Something
.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html > St.Ack > > On Thu, Feb 3, 2011 at 6:01 AM, Something Something > wrote: > > I want to read only the keys in a table. I tried this... > > > >try { > > > > HTable table =

Fastest way to read only the keys of a HTable?

2011-02-02 Thread Something Something
I want to read only the keys in a table. I tried this... try { HTable table = new HTable("myTable"); Scan scan = new Scan(); scan.addFamily(Bytes.toBytes("Info")); ResultScanner scanner = table.getScanner(scan); Result result = scanner.next(); while (result != null) { & so on...

Re: Tables & rows disappear

2011-02-02 Thread Something Something
Stack - Any thoughts on this? On Mon, Jan 31, 2011 at 6:27 PM, Something Something < mailinglist...@gmail.com> wrote: > 1) Version numbers: > > hadoop-0.20.2 > hbase-0.20.6 > > > 2) autoFlush to 'true' works, but wouldn't that slow down the insertion

Re: Tables & rows disappear

2011-01-31 Thread Something Something
s("info"), Bytes.toBytes("code"), Bytes.toBytes( code)); & so on... and at the end... table.put(put); Is this not the right way to do it? Please let me know. Thanks for the help. On Sun, Jan 30, 2011 at 3:03 PM, Stack wrote: > What version of hbase+hadoop

Re: Bytes.toString(value)) returns empty string

2011-01-21 Thread Something Something
ind out why you're not getting > that data back if it's supposed to exist. > > J-D > > On Thu, Jan 20, 2011 at 11:52 PM, Something Something > wrote: > > I have a column that looks like this under hbase shell: > > > > column=Request:placement, timestamp

Bytes.toString(value)) returns empty string

2011-01-20 Thread Something Something
I have a column that looks like this under hbase shell: column=Request:placement, timestamp=1295593730949, value=specific.ea.tracking.promo.deadspace2 In my code I have something like this... byte[] value = result.getValue(Bytes.toBytes("Request"), Bytes.toBytes("placement")); LOG.info("Pla