question about HTableDescriptor

2011-02-03 Thread Weishung Chung
I am looking at the following protected HTableDescriptor constructor, but I can't figure out the purpose of the Map values parameter. What does it contain? protected HTableDescriptor(final byte [] name, HColumnDescriptor[] families, Map<ImmutableBytesWritable, ImmutableBytesWritable> values) Thank you,
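
The values map carries table-level metadata (entries such as MAX_FILESIZE, READONLY, and MEMSTORE_FLUSHSIZE) as ImmutableBytesWritable key/value pairs. A rough sketch of reading and writing it through the public API; the table name is made up:

    HTableDescriptor desc = new HTableDescriptor("mytable");
    // setValue() stores an entry in the descriptor's values map
    desc.setValue("MAX_FILESIZE", String.valueOf(256L * 1024 * 1024));
    // getValue() reads it back out of the same map
    String maxFileSize = desc.getValue("MAX_FILESIZE");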

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Lars George
Hi Stack, I was just asking Todd the same thing, i.e. fixed new size vs NewRatio. He and you have done way more GC debugging than me, so I trust whatever Todd or you say. I would leave UseParNewGC in for good measure (not relying on implicit defaults). I also re-read just before I saw your reply

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Stack
Yeah, our wiki page seems way off to me. I can update it. Rather than hardcoding an absolute new gen size, Todd, how about using -XX:NewRatio=3, say; i.e. 1/4 of the heap is new gen (maybe it should be 1/3rd!). Does UseParNewGC do anything? I seem to see the 'parallel' rescans whether it's on or off (Thi

Re: Why Random Reads are much slower than the Writes

2011-02-03 Thread Ryan Rawson
Sequential writes are always faster than random reads on disk. You want caching. Lots of it :) On Feb 3, 2011 10:24 PM, "charan kumar" wrote: > Hello, > > I am using Hbase 0.90.0 with hadoop-append on a 30 m/c cluster (1950, 2 > CPU, 6 G). > > Writes peak at 5000 per second. But Reads are only at 1
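
A sketch of the cache knobs involved, against the 0.90-era API (the family name is made up); the block cache's share of heap is set by hfile.block.cache.size in hbase-site.xml:

    HColumnDescriptor family = new HColumnDescriptor("data");
    family.setBlockCacheEnabled(true); // cache data blocks from this family (the default)
    family.setInMemory(true);          // give this family's blocks eviction priority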

Re: Type mismatch

2011-02-03 Thread Mark Kerzner
I am on 0.89 from CDH3. I tried IdentityTableReducer, but get the same error. I will try 0.90. Should I include the HBase code, so that I can step through it? On Fri, Feb 4, 2011 at 12:38 AM, Stack wrote: > I'm not sure what's up w/ your sample above. Here are some observations > that might help. >

Re: Type mismatch

2011-02-03 Thread Stack
I'm not sure what's up w/ your sample above. Here are some observations that might help. Here is the code. Our line numbers differ. You are not on 0.90.0? That's not important. You are in this method it seems: http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html#12

Re: Queries regarding REST API

2011-02-03 Thread Hari Sreekumar
Hi Andrew, Plain text output would be great for me. Please add it if it isn't too much effort. You mean plain text as in text/plain, or plain text within the xml/json? Thanks a lot, Hari On Fri, Feb 4, 2011 at 12:37 AM, Andrew Purtell wrote: > > Thanks guys for the replies. Is there any differe

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
Here you go.. HBase Performance tuning page http://wiki.apache.org/hadoop/Hbase/FAQ#A7 refers to the following hadoop URL. http://wiki.apache.org/hadoop/PerformanceTuning Thanks, Charan On Thu, Feb 3, 2011 at 10:22 PM, Todd Lipcon wrote: > Does the wiki really recommend that? Got a link handy

Why Random Reads are much slower than the Writes

2011-02-03 Thread charan kumar
Hello, I am using Hbase 0.90.0 with hadoop-append on a 30 m/c cluster (1950, 2 CPU, 6 G). Writes peak at 5000 per second. But Reads are only at 1000 QPS. We hash the key for even distribution across regions. Any recommendations/suggestions? Thanks, Charan

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Todd Lipcon
Does the wiki really recommend that? Got a link handy? On Thu, Feb 3, 2011 at 10:20 PM, charan kumar wrote: > Todd, > > That did the trick. I think the wiki should be updated as well; no point > in recommending a 6M ParNew, or is there? > > Thanks, > Charan. > > On Thu, Feb 3, 2011 at 2:06 PM, Charan

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
Todd, That did the trick. I think the wiki should be updated as well; no point in recommending a 6M ParNew, or is there? Thanks, Charan. On Thu, Feb 3, 2011 at 2:06 PM, Charan K wrote: > Thanks Todd.. I will try it out .. > > > On Feb 3, 2011, at 1:43 PM, Todd Lipcon wrote: > > > Hi Charan, > >

Re: Type mismatch

2011-02-03 Thread Mark Kerzner
Thank you, St.Ack, it is very nice of you to keep helping me. Here is the stack :) trace, but as you can see, it is the internal Hadoop code. I see this code and I see the message - I am not passing it the right object - but how DO I pass the right object? M at org.apache.hadoop.hbase.ma

Re: Type mismatch

2011-02-03 Thread Stack
Look at the stack trace. See where it's being thrown. Look at the src code at that line offset. Should give you a clue. St.Ack On Thu, Feb 3, 2011 at 9:36 PM, Mark Kerzner wrote: > Thank you, that helped, but now I get this error on trying to write back to > HBase: > > java.io.IOException: Pas

Re: Type mismatch

2011-02-03 Thread Mark Kerzner
Thank you, that helped, but now I get this error on trying to write back to HBase: java.io.IOException: Pass a Delete or a Put Here is a fragment of my code. Again, thanks a bunch! public static class RowCounterReducer extends TableReducer { public void reduce(Text k
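
A minimal reducer that TableOutputFormat accepts looks roughly like this; the family/qualifier names and the counting logic are illustrative, not the original code:

    public static class RowCounterReducer
        extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
      @Override
      public void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
          sum += v.get();
        }
        // TableOutputFormat only accepts a Put or a Delete as the output value
        Put put = new Put(Bytes.toBytes(key.toString()));
        put.add(Bytes.toBytes("info"), Bytes.toBytes("count"), Bytes.toBytes(sum));
        context.write(new ImmutableBytesWritable(put.getRow()), put);
      }
    }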

Re: Exception reading from Hbase table with LZO compression

2011-02-03 Thread Ashish Shinde
hi Todd, Sorry, forgot to mention that. I am using your version 0.4.9, built from https://github.com/toddlipcon/hadoop-lzo Thanks and regards, - Ashish On Thu, 3 Feb 2011 11:25:09 -0800 Todd Lipcon wrote: > Hi Ashish, > > Which version of the LZO libraries are you using? > > -Todd > > O

Re: HBase as a backend for GUI app?

2011-02-03 Thread Ted Dunning
There are no dumb questions, just dumb answers. On Thu, Feb 3, 2011 at 4:27 PM, Something Something < mailinglist...@gmail.com> wrote: > Javamann - I like to be anonymous so I don't feel bad about asking dumb > questions ;) >

Re: HBase as a backend for GUI app?

2011-02-03 Thread Something Something
Thanks everybody for the replies. Sounds very reassuring. We will continue with our plan of accessing HBase directly from GWT for our POC. Javamann - I like to be anonymous so I don't feel bad about asking dumb questions ;) On Thu, Feb 3, 2011 at 3:41 PM, Dani Rayan wrote: > +1 for PS > > > O

Re: HBase as a backend for GUI app?

2011-02-03 Thread Dani Rayan
+1 for PS On Thu, Feb 3, 2011 at 6:32 PM, wrote: > Sounds like we are doing the same thing. I am hitting an HBase backend from > an Ajax frontend (via Spring MVC) to access our log files to generate the > same type of reports. Most of the time I get sub-second response. Longest is > around 8 sec

Re: HBase as a backend for GUI app?

2011-02-03 Thread javamann
Sounds like we are doing the same thing. I am hitting an HBase backend from an Ajax frontend (via Spring MVC) to access our log files to generate the same type of reports. Most of the time I get sub-second response. Longest is around 8 seconds. This was my first try at HBase and my next rev. wil

Re: HBase as a backend for GUI app?

2011-02-03 Thread Ryan Rawson
Well, at SU we access hbase directly from our website, so it is possible. We get great response times to our queries, but that doesn't mean you will get great response times to your queries. Perhaps you should build a POC and see where that goes? -ryan On Thu, Feb 3, 2011 at 3:22 PM, Something S

Re: HBase as a backend for GUI app?

2011-02-03 Thread Something Something
By GUI app, I meant a browser based application that's written in GWT. For example, let's say an application that allows users to view logs such as application server logs or 'click tracking logs' etc. These logs are HUGE, so our requirement is to allow users to view a month's worth of data - whi

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
Awesome! It's instantaneous now. Thanks a bunch. Any such tricks for code that looks like this... Get get = new Get(Bytes.toBytes(code)); Result result = table.get(get); NavigableMap<byte[], byte[]> map = result.getFamilyMap(Bytes.toBytes("Keys")); if (map != null) { for (Map.En
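
In the same spirit as the scan fix, restrict the Get to the family (or exact columns) you need so nothing else is materialized. A sketch along the lines of the fragment above:

    Get get = new Get(Bytes.toBytes(code));
    get.addFamily(Bytes.toBytes("Keys")); // only read the family being consumed
    Result result = table.get(get);
    NavigableMap<byte[], byte[]> map = result.getFamilyMap(Bytes.toBytes("Keys"));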

Re: HBase as a backend for GUI app?

2011-02-03 Thread Ryan Rawson
I think the answer is 'it depends'. What exactly is a "GUI app" anyways these days? The wording is a little vague to me; does that include things like amazon.com and google reader? Or is it limited to things like Firefox, and desktop applications? I think ultimately the only thing that is a must f

Re: Setup Question/Recommendation

2011-02-03 Thread Jean-Daniel Cryans
Inline. J-D >  1. > I posted a question a couple days ago about raid configuration for Hadoop and > the answer is JBOD; however, once you set that up and you are going through > your Linux install, what volume formatting do you select? ext3/4? lvm? ext4 seems to be the new favorite, before that
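
For illustration, a bare-bones per-disk setup under those assumptions (device and mount point are placeholders; noatime is a common recommendation for HDFS data disks):

    mkfs -t ext4 /dev/sdb1
    mount -o noatime /dev/sdb1 /data/1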

HBase as a backend for GUI app?

2011-02-03 Thread Something Something
Is it advisable to use HBase as a backend for a GUI app or is HBase more for storing huge amounts of data used mainly for data analysis in non-online/batch mode? In other words, after storing data on HBase do most people extract the summary and store it in a SQL database for quick retrieval by GUI

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread arv...@cloudera.com
On Thu, Feb 3, 2011 at 2:17 PM, Weishung Chung wrote: > that's awesome !!! > could I go from HBase to Mysql too since I might have to maintain > backward compatibility between the two systems in the process of switch > over > to HBase? > At this time Sqoop supports importing data from external d

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
Got lotsa reading to do tonite :D On Thu, Feb 3, 2011 at 4:17 PM, Dani Rayan wrote: > Hey, > > Flume: http://archive.cloudera.com/cdh/3/flume/UserGuide.html has a HBase > sink. > Flume is similar to Scribe, but has more to it. > > Original Jira: https://issues.cloudera.org/browse/FLUME-6 > Branc

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Dani Rayan
Hey, Flume: http://archive.cloudera.com/cdh/3/flume/UserGuide.html has a HBase sink. Flume is similar to Scribe, but has more to it. Original Jira: https://issues.cloudera.org/browse/FLUME-6 Branch: https://github.com/cloudera/flume/tree/hbase -Thanks, Dani. http://www.cc.gatech.edu/~iar3/ On T

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
that's awesome !!! could I go from HBase to MySQL too, since I might have to maintain backward compatibility between the two systems in the process of switching over to HBase? On Thu, Feb 3, 2011 at 4:11 PM, arv...@cloudera.com wrote: > On Thu, Feb 3, 2011 at 1:56 PM, Weishung Chung wrote: > > > Abo

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Jean-Daniel Cryans
On the scan, you can setCaching with the number of rows you want to pre-fetch per RPC. Setting it to 2 is already 2x better than the default. J-D On Thu, Feb 3, 2011 at 1:35 PM, Something Something wrote: > After adding the following line: > > scan.addFamily(Bytes.toBytes("Info")); > > performan
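
Applied to the code from earlier in the thread, that is one extra line:

    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("Info"));
    scan.setCaching(500); // pre-fetch 500 rows per RPC instead of the default 1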

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread arv...@cloudera.com
On Thu, Feb 3, 2011 at 1:56 PM, Weishung Chung wrote: > About Sqoop, could I import the data specifically into HBase ? > Yes - Sqoop supports direct imports from external databases into HBase. More details can be found in Sqoop documentation available here
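
The shape of such an import, sketched; the JDBC URL, table names, family, and row key are placeholders, and the flag names should be checked against the Sqoop version in use:

    sqoop import \
      --connect jdbc:mysql://dbhost/mydb --table orders \
      --hbase-table orders --column-family d \
      --hbase-row-key order_id --hbase-create-table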

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Charan K
Thanks Todd.. I will try it out .. On Feb 3, 2011, at 1:43 PM, Todd Lipcon wrote: > Hi Charan, > > Your GC settings are way off - 6m newsize will promote way too much to the > oldgen. > > Try this: > > -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmn256m > -XX:CMSInitiatingOccupancyFraction=70

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
About Sqoop, could I import the data specifically into HBase? I know I could write a program to read from MySQL and use the HBase API to write to HBase. On Thu, Feb 3, 2011 at 3:49 PM, Weishung Chung wrote: > thank you for the clarification :) I am reading about sqoop now... > > > On Thu, Feb 3, 20

Re: .oldlogs Cleanup

2011-02-03 Thread Jean-Daniel Cryans
Sorry, got busy with other stuff and forgot your issue. So unless you are running with replication enabled, ReplicationLogCleaner shouldn't be running. It seems that you have an old hbase-default.xml lying around your classpath. Please update to the latest version of that file. It could be that t

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
thank you for the clarification :) I am reading about sqoop now... On Thu, Feb 3, 2011 at 3:41 PM, Mark Kerzner wrote: > scribe is a tool for log aggregation, at face value, not mysql > > On Thu, Feb 3, 2011 at 3:37 PM, Weishung Chung wrote: > > > Thank you for all the quick response. This real

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread Todd Lipcon
Hi Charan, Your GC settings are way off - 6m newsize will promote way too much to the oldgen. Try this: -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmn256m -XX:CMSInitiatingOccupancyFraction=70 -Todd On Thu, Feb 3, 2011 at 12:28 PM, charan kumar wrote: > HI Jonathan, > > Thanks for you quick r
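
In hbase-env.sh form that would be something like (a sketch; keep whatever other flags you already pass):

    export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError \
      -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmn256m \
      -XX:CMSInitiatingOccupancyFraction=70"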

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Mark Kerzner
Scribe is a tool for log aggregation, at face value, not a MySQL import tool On Thu, Feb 3, 2011 at 3:37 PM, Weishung Chung wrote: > Thank you for all the quick response. This really helps me along. How about > facebook's scribe? > > On Thu, Feb 3, 2011 at 3:35 PM, Mark Kerzner > wrote: > > > Overview: > > >

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
Thank you for all the quick response. This really helps me along. How about facebook's scribe? On Thu, Feb 3, 2011 at 3:35 PM, Mark Kerzner wrote: > Overview: > > http://hadoopinpractice.blogspot.com/2011/01/loading-data-from-mysql-to-hadoop.html > > On Thu, Feb 3, 2011 at 3:28 PM, Ryan Rawson

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
After adding the following line: scan.addFamily(Bytes.toBytes("Info")); performance improved dramatically (Thank you both!). But now I want it to perform even faster, if possible -:) To read 43 rows, it's taking 2 seconds. Eventually, the 'partner' table may have over 500 entries. I guess, I

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Mark Kerzner
Overview: http://hadoopinpractice.blogspot.com/2011/01/loading-data-from-mysql-to-hadoop.html On Thu, Feb 3, 2011 at 3:28 PM, Ryan Rawson wrote: > ImportTSV? > > > http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/ImportTsv.html > > Also writing a job to read from JDBC and write

Re: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Ryan Rawson
ImportTSV? http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/ImportTsv.html Also, writing a job to read from JDBC and write to hbase isn't too bad if your schema isn't too insanely complex. -ryan On Thu, Feb 3, 2011 at 1:23 PM, Buttler, David wrote: > Sqoop? > http://archive.cloud
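
A typical invocation looks something like this (jar name, column spec, table, and input path are all illustrative):

    hadoop jar hbase-0.90.0.jar importtsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,d:col1,d:col2 \
      mytable /input/dir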

RE: is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Buttler, David
Sqoop? http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html -Original Message- From: Weishung Chung [mailto:weish...@gmail.com] Sent: Thursday, February 03, 2011 1:18 PM To: user@hbase.apache.org Subject: is there any tool that facilitate the import of data to hbase I am looking f

is there any tool that facilitate the import of data to hbase

2011-02-03 Thread Weishung Chung
I am looking for a tool that allows me to import data from MySQL to HBase. Any suggestions? Thank you :)

Re: Type mismatch

2011-02-03 Thread Stack
You are emitting a Text type. Try just passing 'row' to the context, the one passed in to your map. St.Ack On Thu, Feb 3, 2011 at 12:23 PM, Mark Kerzner wrote: > Hi, > > I have this code to read and write to HBase from MR, and it works fine with > 0 reducers, but it gives a type mismatch error w

Re: .oldlogs Cleanup

2011-02-03 Thread Wayne
I am deleting .oldlog files manually now. I am seeing a ton of the errors below. Are these errors due to me manually deleting the .oldlog files or is this the error from the bug explaining why they are not deleted on their own? 2011-02-03 12:07:23,618 ERROR org.apache.hadoop.hbase.master.LogCleane

Re: Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
Hi Jonathan, Thanks for your quick reply. Heap is set to 4G. Following are the JVM opts. export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:NewSize=6m -XX:MaxNewSize=6m" Are there any other options apart from increasing the RAM?

Type mismatch

2011-02-03 Thread Mark Kerzner
Hi, I have this code to read and write to HBase from MR, and it works fine with 0 reducers, but it gives a type mismatch error when with 1 reducer. What should I look at? *Thank you!* *Code:* static class RowCounterMapper extends TableMapper { private static enum Counter

Setup Question/Recommendation

2011-02-03 Thread Joseph Coleman
1. I posted a question a couple days ago about raid configuration for Hadoop and the answer is JBOD; however, once you set that up and you are going through your Linux install, what volume formatting do you select? ext3/4? lvm? 2. If I am looking at having a 10+ node data/region server cluster

RE: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Jonathan Gray
If you only need to consider a single column family, use Scan.addFamily() on your scanner. Then there will be no impact of the other column families. > -Original Message- > From: Something Something [mailto:mailinglist...@gmail.com] > Sent: Thursday, February 03, 2011 11:28 AM > To: user

RE: Region Servers Crashing during Random Reads

2011-02-03 Thread Jonathan Gray
How much heap are you running on your RegionServers? 6GB of total RAM is on the low end. For high throughput applications, I would recommend at least 6-8GB of heap (so 8+ GB of RAM). > -Original Message- > From: charan kumar [mailto:charan.ku...@gmail.com] > Sent: Thursday, February 03,

Region Servers Crashing during Random Reads

2011-02-03 Thread charan kumar
Hello, I am using hbase 0.90.0 with hadoop-append. h/w (Dell 1950, 2 CPU, 6 GB RAM) I had 9 Region Servers crash (out of 30) in a span of 30 minutes during heavy reads. It looks like a GC/ZooKeeper connection timeout thingy to me. I did all the recommended configuration from the HBase wiki... An

Re: Fastest way to read only the keys of a HTable?

2011-02-03 Thread Something Something
Hmm.. performance hasn't improved at all. Do you see anything wrong with the following code: public List getPartners() { ArrayList partners = new ArrayList(); try { HTable table = new HTable("partner"); Scan scan = new Scan(); scan.setFilter(new Fir
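
For comparison, a keys-only scan usually combines FirstKeyOnlyFilter with a family restriction and a larger caching value. A sketch; the family name is illustrative:

    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("Info"));    // skip all other families
    scan.setFilter(new FirstKeyOnlyFilter()); // return only the first KV of each row
    scan.setCaching(500);                     // batch rows per RPC
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        byte[] rowKey = r.getRow(); // the only piece we asked for
      }
    } finally {
      scanner.close();
    }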

Re: Exception reading from Hbase table with LZO compression

2011-02-03 Thread Todd Lipcon
Hi Ashish, Which version of the LZO libraries are you using? -Todd On Thu, Feb 3, 2011 at 7:46 AM, Ashish Shinde wrote: > Hi, > > I get the following exception when reading from a table with LZO > compression (using a M/R job as well as shell) on hbase 0.90.0. > > Is this hbase related of is t

Re: Queries regarding REST API

2011-02-03 Thread Andrew Purtell
> Thanks guys for the replies. Is there any difference if I use json > representation (application/json)? Key, column, and value will also be base64 encoded if using JSON representation > Can I get data as plain text with the special characters > escaped in some way? I could add a text/plain rep
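
In the meantime, decoding on the client is a one-liner; a sketch with commons-codec, where the literal stands in for a value pulled out of the JSON:

    byte[] raw = org.apache.commons.codec.binary.Base64.decodeBase64("bGFycw==".getBytes());
    String value = new String(raw); // "lars"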

RE: doing a scan that will return random columns in a table's family

2011-02-03 Thread Peter Haidinyak
Thanks -Original Message- From: Jonathan Gray [mailto:jg...@fb.com] Sent: Thursday, February 03, 2011 10:19 AM To: user@hbase.apache.org Subject: RE: doing a scan that will return random columns in a table's family Result is just the client-side class which wraps whatever the server ret

RE: doing a scan that will return random columns in a table's family

2011-02-03 Thread Jonathan Gray
Result is just the client-side class which wraps whatever the server returns. The ability to do this query is not really about whether Result has the methods to get at this data, but rather whether Scan supports this type of query (it does). Scan.addFamily(family) will make it so that every co

RE: doing a scan that will return random columns in a table's family

2011-02-03 Thread Peter Haidinyak
Thanks -Original Message- From: Buttler, David [mailto:buttl...@llnl.gov] Sent: Thursday, February 03, 2011 8:53 AM To: user@hbase.apache.org Subject: RE: doing a scan that will return random columns in a table's family By default that is what you get. You do have to navigate through th

RE: doing a scan that will return random columns in a table's family

2011-02-03 Thread Buttler, David
By default that is what you get. You do have to navigate through the results: Result.getFamilyMap():

    /**
     * Map of qualifiers to values.
     *
     * Returns a Map of the form: Map<qualifier, value>
     * @param family column family to get
     * @return map of qualifiers to values
     */
    public Na
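
Iterating that map then yields the generated column names; a sketch with an illustrative family name:

    NavigableMap<byte[], byte[]> familyMap = result.getFamilyMap(Bytes.toBytes("family"));
    for (Map.Entry<byte[], byte[]> e : familyMap.entrySet()) {
      String qualifier = Bytes.toString(e.getKey()); // the generated column name
      byte[] value = e.getValue();
    }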

Exception reading from Hbase table with LZO compression

2011-02-03 Thread Ashish Shinde
Hi, I get the following exception when reading from a table with LZO compression (using a M/R job as well as shell) on hbase 0.90.0. Is this hbase related or is there a bug in hadoop-gpl-compression? Thanks and regards, - Ashish java.io.IOException: java.io.IOException: java.lang.IllegalArg

Re: Queries regarding REST API

2011-02-03 Thread Hari Sreekumar
Thanks guys for the replies. Is there any difference if I use json representation (application/json)? Can I get another encoding if I go that way? Can I get data as plain text with the special characters escaped in some way? hari On Wed, Feb 2, 2011 at 1:50 AM, Andrew Purtell wrote: > > The pro

Re: Persist JSON into HBase

2011-02-03 Thread Lars George
Sorry for the late bump... It is quite nice to store JSON as strings in HBase, i.e. use for example JSONObject to convert to something like "{ "name" : "lars" }" and then Bytes.toBytes(jsonString). Since Hive now has an HBase handler, you can use Hive and its built-in JSON support to query cells lik
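
Concretely, along these lines (row key and family/qualifier are made up; JSONObject here is the json.org class):

    JSONObject json = new JSONObject();
    json.put("name", "lars");
    Put put = new Put(Bytes.toBytes("row1"));
    // store the whole document as a single string-valued cell
    put.add(Bytes.toBytes("data"), Bytes.toBytes("doc"), Bytes.toBytes(json.toString()));
    table.put(put);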

doing a scan that will return random columns in a table's family

2011-02-03 Thread Pete Haidinyak
Hi, If I have a table:family where I add new columns with computer-generated column names (I won't know what they are, so I can't add them to a scan), is it possible to do a scan that returns every column in a row? Thanks -Pete