Re: convert .META. entries to .regioninfo

2015-10-05 Thread Jack Levin
ou need this? hbck can create > .regioninfo files if missing if thats what you need. > > thanks, > esteban. > > > > -- > Cloudera, Inc. > > > On Mon, Oct 5, 2015 at 1:23 PM, Jack Levin wrote: > >> Hi All, does anyone happen to know or use a script or to

convert .META. entries to .regioninfo

2015-10-05 Thread Jack Levin
Hi All, does anyone happen to know or use a script or tool that can read .META. rows for regioninfo and create a .regioninfo file? Thanks. -Jack
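The task asked about above can be sketched in outline: walk the region rows, and for each one write a `.regioninfo` file under that region's directory. This is only a hedged stand-in, not the real HBase serialization (a real HRegionInfo is a Writable, and `hbck` is the supported tool, as the reply notes); the row layout and helper name here are invented for illustration:

```python
import json
from pathlib import Path

# Hypothetical stand-in for rows read out of .META.: in a real cluster each
# row's "info:regioninfo" cell holds a serialized HRegionInfo; here we model
# it as a plain dict so the file-writing step can be shown end to end.
def write_regioninfo_files(meta_rows, root):
    """For each region row, write a .regioninfo file under the region's dir."""
    written = []
    for row in meta_rows:
        region_dir = Path(root) / row["table"] / row["encoded_name"]
        region_dir.mkdir(parents=True, exist_ok=True)
        target = region_dir / ".regioninfo"
        target.write_text(json.dumps(
            {"table": row["table"],
             "start_key": row["start_key"],
             "end_key": row["end_key"]}))
        written.append(str(target))
    return written
```

The JSON payload is purely a placeholder for whatever serialization the real tool would emit.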

Re: What companies are using HBase to serve a customer-facing product?

2014-12-06 Thread Jack Levin
We at Imageshack use Hbase to store all of our images, currently at ~2 billion rows with about 350+ TB. Jack On Friday, December 5, 2014, iain wright wrote: > Hi Jeremy, > > pinterest is using it for their feeds: > http://www.slideshare.net/cloudera/case-studies-session-3a > http://www.slideshare.net

Re: unable to delete rows in some regions

2014-04-22 Thread Jack Levin
plitThread: Compaction (major) requested for img611,u4rpx.jpg,1329700235569.cf0a557ff4030c238fc5a6ad732be45f. because User-triggered major compaction; priority=1, compaction queue size=2 However, split never occurred, and data did not get cleaned. -Jack On Tue, Apr 22, 2014 at 10:34 AM, Jack Le

unable to delete rows in some regions

2014-04-22 Thread Jack Levin
Hey All, I was wondering if anyone had this issue with 0.90.5 HBASE. I have a table 'img611', I issue delete of keys like this: hbase(main):004:0> describe 'img611' DESCRIPTION ENABLED {NAME => 'img611', FAMILIES => [{NAME => 'att', BLOOMFI

Re: Question about dead datanode

2014-02-27 Thread Jack Levin
Is this related to JIRA HDFS-378? On Wed, Feb 26, 2014 at 11:54 AM, Jack Levin wrote: > Submitted JIRA patch: https://issues.apache.org/jira/browse/HDFS-6022 > (with test) > > On Mon, Feb 24, 2014 at 12:16 PM, Jack Levin wrote: >> I will do that. >> >> -Jack >

Re: Question about dead datanode

2014-02-26 Thread Jack Levin
Submitted JIRA patch: https://issues.apache.org/jira/browse/HDFS-6022 (with test) On Mon, Feb 24, 2014 at 12:16 PM, Jack Levin wrote: > I will do that. > > -Jack > > On Mon, Feb 24, 2014 at 6:23 AM, Steve Loughran > wrote: >> that's a very old version of cloudera&

Re: Question about dead datanode

2014-02-24 Thread Jack Levin
; > On 19 February 2014 04:48, Stack wrote: > >> On Sat, Feb 15, 2014 at 8:01 PM, Jack Levin wrote: >> >> > Looks like I patched it in DFSClient.java, here is the patch: >> > https://gist.github.com/anonymous/9028934 >> > >> > >> >&

Re: Question about dead datanode

2014-02-23 Thread Jack Levin
I can submit Jira for this if you feel that's appropriate On Feb 18, 2014 8:49 PM, "Stack" wrote: > On Sat, Feb 15, 2014 at 8:01 PM, Jack Levin wrote: > > > Looks like I patched it in DFSClient.java, here is the patch: > > https://gist.github.com/anonymous/902

Re: Question about dead datanode

2014-02-15 Thread Jack Levin
FSClient: Found bestNode:: 10.101.5.5:50010 2014-02-15 19:47:05,686 INFO org.apache.hadoop.hdfs.DFSClient: Datanode available for block: 10.101.5.5:50010 -Jack On Fri, Feb 14, 2014 at 10:16 AM, Jack Levin wrote: > I found the code path that does not work, patched it. Will report if it > fix

Re: Question about dead datanode

2014-02-14 Thread Jack Levin
I found the code path that does not work, patched it. Will report if it fixes the problem On Feb 14, 2014 8:19 AM, "Jack Levin" wrote: > 0.20.2-cdh3u2 -- > > "add to deadNodes and continue" would solve this issue. For some reason > its not getting into this c

Re: Question about dead datanode

2014-02-14 Thread Jack Levin
k On Thu, Feb 13, 2014 at 10:55 PM, Stack wrote: > On Thu, Feb 13, 2014 at 9:18 PM, Jack Levin wrote: > > > One other question, we get this: > > > > 2014-02-13 02:46:12,768 WARN org.apache.hadoop.hdfs.DFSClient: Failed to > > connect to /10

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
lient: Failed to connect to / 10.103.8.109:50010, add to deadNodes and continue "add to deadNodes and continue" specifically? -Jack On Thu, Feb 13, 2014 at 8:55 PM, Jack Levin wrote: > I meant to say, I can't upgrade now, its a petabyte storage system. A > little hard to kee
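The fix this thread converges on ("add to deadNodes and continue") amounts to the client remembering which datanodes have failed and skipping them on subsequent replica choices. A toy model of that behavior, with invented names (the real logic lives in DFSClient's datanode-selection path):

```python
class ReplicaPicker:
    """Toy model of the DFSClient behavior discussed above: remember
    datanodes that failed ("add to deadNodes and continue") and skip
    them when choosing which replica to read a block from."""
    def __init__(self):
        self.dead_nodes = set()

    def pick(self, replicas):
        # Return the first replica not known to be dead.
        for node in replicas:
            if node not in self.dead_nodes:
                return node
        raise IOError("no live replica for block")

    def mark_dead(self, node):
        # Called after a connect failure, instead of retrying forever.
        self.dead_nodes.add(node)
```

Without the `dead_nodes` set, the picker would keep returning the same downed node, which is exactly the retry loop described in the thread.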

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
I meant to say, I can't upgrade now, it's a petabyte storage system. A little hard to keep a copy of something like that. On Thu, Feb 13, 2014 at 3:20 PM, Jack Levin wrote: > Can upgrade now but I would take suggestions on how to deal with this > On Feb 13, 2014 2:02 PM, "Stack" wrote:

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
Can upgrade now but I would take suggestions on how to deal with this On Feb 13, 2014 2:02 PM, "Stack" wrote: > Can you upgrade Jack? This stuff is better in later versions (dfsclient > keeps running list of bad datanodes...) > St.Ack > > > On Thu, Feb 13, 2014 a

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
13, 2014 at 1:41 PM, Jack Levin wrote: > As far as I can tell I am hitting this issue: > > > http://grepcode.com/search/usages?type=method&id=repository.cloudera.com%24content%24repositories%24releases@com.cloudera.hadoop%24hadoop-core@0.20.2-320@org%24apache%24hadoop%24hdfs

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
Stack wrote: > RS opens files and then keeps them open as long as the RS is alive. We're > failing read of this replica and then we succeed getting the block > elsewhere? You get that exception every time? What hadoop version Jack? > You have short-circuit reads on? > St.

Re: Question about dead datanode

2014-02-13 Thread Jack Levin
I meant its in the 'dead' list on HDFS namenode page. Hadoop fsck / shows no issues. On Thu, Feb 13, 2014 at 10:38 AM, Jack Levin wrote: > Good morning -- > I had a question, we have had a datanode go down, and its been down for > few days, however hbase is trying t

Question about dead datanode

2014-02-13 Thread Jack Levin
Good morning -- I had a question, we have had a datanode go down, and it's been down for a few days, however hbase is trying to talk to that dead datanode still 2014-02-13 08:57:23,073 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.101.5.5:50010 for file /hbase/img39/6388c3574c32c40

Re: Storing images in Hbase

2013-01-28 Thread Jack Levin
I've never tried it, HBASE worked out nicely for this task, caching and all is a bonus for files. -jack On Mon, Jan 28, 2013 at 2:01 AM, Adrien Mogenet wrote: > Could HCatalog be an option ? > Le 26 janv. 2013 21:56, "Jack Levin" a écrit : >> >> AFAIK, na

Re: Storing images in Hbase

2013-01-27 Thread Jack Levin
store, or we can rent our own cluster with Restful API. If anyone's interested, ping me off the list please. Thanks. -Jack On Sun, Jan 27, 2013 at 8:06 PM, Jack Levin wrote: > We store image/media data into second hbase cluster, but I don't see a > reason why it would not work

Re: Storing images in Hbase

2013-01-27 Thread Jack Levin
hould learn it. > > Also, do you store meta data of each video clip directly in HDFS or you > have other storage like memcache? > > thanks and regards, > > Yiyu > > > On Sun, Jan 27, 2013 at 11:56 AM, Jack Levin wrote: > >> We did some experiments, open source p

Re: Storing images in Hbase

2013-01-27 Thread Jack Levin
> thanks and regards, > > Yiyu > > On Sat, Jan 26, 2013 at 9:56 PM, Jack Levin wrote: > >> AFAIK, namenode would not like tracking 20 billion small files :) >> >> -jack >> >> On Sat, Jan 26, 2013 at 6:00 PM, S Ahmed wrote: >> > That's

Re: Storing images in Hbase

2013-01-26 Thread Jack Levin
AFAIK, namenode would not like tracking 20 billion small files :) -jack On Sat, Jan 26, 2013 at 6:00 PM, S Ahmed wrote: > That's pretty amazing. > > What I am confused is, why did you go with hbase and not just straight into > hdfs? > > > > > On Fri, Jan 25, 20

Re: 答复: GC pause issues

2013-01-25 Thread Jack Levin
s enabled (index.cacheonwrite and > bloom.cacheonwrite) to cache the index/bloom blocks upon hfile writes > though i find it unlikely that they could be impacting my setup. > > On Thu, Jan 24, 2013 at 11:46 PM, Jack Levin wrote: > >> Generally, the larger the flush to harder the GC will work.

Re: 答复: GC pause issues

2013-01-24 Thread Jack Levin
Generally, the larger the flush, the harder the GC will work. Flush more often to avoid this. What is your total heap size set at? On Jan 24, 2013 9:02 PM, "Varun Sharma" wrote: > I do have significant block cache churn and this issue is typical > correlated with a huge increase in read latencies -

Re: Storing images in Hbase

2013-01-24 Thread Jack Levin
; > On Mon, Jan 21, 2013 at 5:10 PM, Varun Sharma > wrote: > > > Thanks for the useful information. I wonder why you use only 5G heap > when > > > you have an 8G machine ? Is there a reason to not use all of it (the > > > DataNode typ

Re: Storing images in Hbase

2013-01-23 Thread Jack Levin
re a reason to not use all of it (the > DataNode typically takes a 1G of RAM) > > On Sun, Jan 20, 2013 at 11:49 AM, Jack Levin wrote: > >> I forgot to mention that I also have this setup: >> >> >> hbase.hregion.memstore.flush.size >> 33554432 >

Re: Storing images in Hbase

2013-01-20 Thread Jack Levin
would mean close to 1G, have you seen any > issues with flushes taking too long ? > > Thanks > Varun > > On Sun, Jan 13, 2013 at 8:17 AM, Jack Levin wrote: > >> That's right, Memstore size , not flush size is increased. Filesize is >> 10G. Overall write cache is 6

RE: Storing images in Hbase

2013-01-13 Thread Jack Levin
e the memstore size: hbase.hregion.max.filesize hbase.hregion.memstore.flush.size On Fri, Jan 11, 2013 at 9:47 AM, Jack Levin wrote: > We buffer all accesses to HBASE with Varnish SSD based caching layer. > So the impact for reads is negligible. We have 70 node cluster, 8 GB > o
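The two properties named above go in hbase-site.xml. The 32 MB flush size matches the 33554432 value quoted elsewhere in this thread; the max filesize value is only an example matching the "Filesize is 10G" remark, not a recommendation:

```xml
<!-- hbase-site.xml: example values only -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>33554432</value> <!-- 32 MB: flush more often, so each flush is smaller -->
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value> <!-- 10 GB regions before a split is triggered -->
</property>
```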

Re: Storing images in Hbase

2013-01-11 Thread Jack Levin
http://img338.imageshack.us/img338/6831/screenshot20130111at949.png this shows how often we flush, and how large the region files are. We do have bloomfilters turned on, so that we don't incur extra seeks across multiple RS files. -Jack On Fri, Jan 11, 2013 at 9:47 AM, Jack Levin wrote:

Re: Storing images in Hbase

2013-01-11 Thread Jack Levin
it wil lead to some issue. >> > > >> > > >> > > >> > > ∞ >> > > Shashwat Shriparv >> > > >> > > >> > > >> > > On Fri, Jan 11, 2013 at 9:21 AM, lars hofhansl >> wrote: >> > >

Re: Storing images in Hbase

2013-01-10 Thread Jack Levin
We stored about 1 billion images into hbase with file size up to 10MB. It's been running for close to 2 years without issues and serves delivery of images for Yfrog and ImageShack. If you have any questions about the setup, I would be glad to answer them. -Jack On Sun, Jan 6, 2013 at 1:09 PM, Mo

Re: Region number and allocation advice

2012-07-06 Thread Jack Levin
Please note that if you have fast growing datastore, you may end up with very large region files - if you limit the number of regions. If that happens (and you can tell by simply examining your HDFS), your compactions (which you can't avoid) will end up rewriting a lot of data. In our case (we ha

Re: Storing extremely large size file

2012-04-17 Thread Jack Levin
What's wrong with that size? We store > 15MB routinely into our image hbase. -Jack On Tue, Apr 17, 2012 at 10:46 AM, Jean-Daniel Cryans wrote: > Make sure the config is changed client-side not server-side. > > Also you might not want to store 12MB values in HBase. > > J-D > > On Tue, Apr 17, 201

Re: Speeding up HBase read response

2012-04-09 Thread Jack Levin
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s  avgrq-sz  avgqu-sz   await  svctm  %util
> xvdap1            0.00     4.00  261.00    8.00     7.84     0.27     61.74     16.07   55.72   3.39  91.30
>
> Jack Levin wrote:
>>
>> Please email

Re: Speeding up HBase read response

2012-04-06 Thread Jack Levin
Please email iostat -xdm 1, run for one minute during load on each node -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. ijanitran wrote: I have 4 nodes HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge instances (16Gb RAM, 4 cores CPU) with 8Gb heap -Xmx allocated f

Re: ideas for hbasecon 2012

2012-02-23 Thread Jack Levin
I would do a panel session on the subject if there is interest. -Jack On Thu, Feb 23, 2012 at 11:50 AM, Andrew Purtell wrote: > As a disclaimer, I'm on the program committee for HBC2012 but this is > strictly my personal opinion. > > I think we should steer clear of "HBase vs. Cassandra" becaus

Re: Hbase Images

2012-01-17 Thread Jack Levin
images (jpgs) are bytes, there is no difference, you just need to add appropriate http headers using nginx or any other proxy of choice and put it on top of REST HBASE api. -Jack On Tue, Jan 17, 2012 at 10:11 AM, shashwat shriparv wrote: > You can not store image as such rather you need to conve
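The "just add appropriate http headers" step above is the whole trick: the bytes come back from the HBase REST API as-is, and the proxy's main job is to attach the right Content-Type. Magic-number sniffing is one hedged way to pick that header when the row key does not encode the format (this sketch is illustrative, not how nginx itself does it):

```python
# Map leading magic bytes to a MIME type; fall back to a generic type.
MAGIC = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG": "image/png",
    b"GIF8": "image/gif",
}

def content_type(data: bytes) -> str:
    """Choose the Content-Type header for a blob fetched from HBase."""
    for magic, ctype in MAGIC.items():
        if data.startswith(magic):
            return ctype
    return "application/octet-stream"
```

In practice the proxy would set this header once and let its cache serve repeat hits without touching HBase at all.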

Re: Meta region hotspotting

2011-12-26 Thread Jack Levin
Some time ago, we had a situation where our REST server was slammed with queries that did not find any matches for rows in Hbase. When that happened we sustained 50k rpc/sec to the META region server as reported by the master web page. After digging deeper we found that each request with 'wrong' url

Re: holes in tables.

2011-12-26 Thread Jack Levin
ASE-4377 >> >> But maybe you're looking for an online solution. >> >> On Sun, Dec 25, 2011 at 7:15 PM, Jack Levin wrote: >> >>> Yep >>> >>> -Jack >>> >>> >>> On Dec 25, 2011, at 4:39 PM, Ted Yu wrote: >>>

Re: holes in tables.

2011-12-25 Thread Jack Levin
Yep -Jack On Dec 25, 2011, at 4:39 PM, Ted Yu wrote: > Which version of HBase ? > I guess 0.90.4 ? > > Cheers > > On Sun, Dec 25, 2011 at 3:55 PM, Jack Levin wrote: > >> 3. enable 'img644' >> >> >> Does not solve the problem. >

Re: holes in tables.

2011-12-25 Thread Jack Levin
3. enable 'img644' Does not solve the problem. -Jack On Sun, Dec 25, 2011 at 3:54 PM, Jack Levin wrote: > Greetings all.  How does one deals with holes in tables between > regions nowadays? > > > Name    Region Server   Start K

holes in tables.

2011-12-25 Thread Jack Levin
Greetings all. How does one deal with holes in tables between regions nowadays?

Name                                                               Region Server   Start Key   End Key
img644,,1317474152909.02f379ab6f08f4d7609ef1245cb7033a.            not deployed                1cce.jpg
img644,1cce.jpg,1317474152909.ebb8778fc1e67965c518e357125678ea.    not deployed
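A "hole" of the kind shown above can be detected mechanically: with regions sorted by start key, every region's end key should equal the next region's start key. A small sketch of that check (data shapes invented; tools like `hbck` do the real version against .META.):

```python
def find_holes(regions):
    """regions: list of (start_key, end_key) pairs sorted by start key,
    with "" standing for the open-ended first/last boundary. A hole
    exists wherever one region's end key is not the next one's start key."""
    holes = []
    for (_, end), (start, _) in zip(regions, regions[1:]):
        if end != start:
            holes.append((end, start))   # key range with no owning region
    return holes
```

Running this over the region list for a table like img644 pinpoints the unreachable key range between two adjacent regions.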

hbase sandbox at ImageShack.

2011-12-01 Thread Jack Levin
Hello All. I've set up an hbase (0.90.4) sandbox running on servers where we have some excess capacity. Feel free to play with it, e.g. create tables, run load tests, benchmarks, essentially do whatever you want, just don't put your production services there, because while we do have it up due to

Re: errors after upgrade

2011-11-14 Thread Jack Levin
Nope, there are no timeouts, the queries are fast and 95% in cache, this looks like a region server tried to read some memory buffer and get 0 bytes in return. -Jack On Mon, Nov 14, 2011 at 3:10 PM, Stack wrote: > On Mon, Nov 14, 2011 at 3:05 PM, Jack Levin wrote: >> No custom code.

Re: errors after upgrade

2011-11-14 Thread Jack Levin
On Wed, Nov 9, 2011 at 7:24 PM, Jack Levin wrote: >> Hey guys, I am getting those errors after moving into 0.90.4: >> > > You have custom code on the server-side Jack?  A filter or something? > > You could turn on rpc logging.  It could give you more clues on what > is

Re: errors after upgrade

2011-11-14 Thread Jack Levin
Anyone seen this before? We continue to have this on several of our clusters. Thanks. -Jack On Wed, Nov 9, 2011 at 7:24 PM, Jack Levin wrote: > Hey guys, I am getting those errors after moving into 0.90.4: > > 2011-11-09 19:22:51,220 ERROR > org.apache.hadoop.hbase.io.HbaseOb

errors after upgrade

2011-11-09 Thread Jack Levin
Hey guys, I am getting those errors after moving into 0.90.4:

2011-11-09 19:22:51,220 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: Error in readFields
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.hbase.client.Get.re

Re: Getting EOF Exception when starting HBASE

2011-11-02 Thread Jack Levin
You likely have hadoop-core in hbase/lib dir that's wrong, delete it, and copy one from Hadoop/ dir -Jack On Nov 2, 2011, at 10:14 AM, LoveIR wrote: > > Hi, > > I am using Hbase 0.90.4 version and my Hadoop version is 0.20.203. I am > getting the following exception in my HMaster logs when

Re: [jira] [Updated] (HBASE-4695) WAL logs get deleted before region server can fully flush

2011-10-31 Thread Jack Levin
      URL: https://issues.apache.org/jira/browse/HBASE-4695 >>             Project: HBase >>          Issue Type: Bug >>          Components: wal >>    Affects Versions: 0.90.4 >>            Reporter: jack levin >>            Assignee: gaojinchao >>            Priorit

Re: Region server crash?

2011-09-21 Thread Jack Levin
The master will detect that the RS is down by periodically checking zookeeper (it will say in the master log: znode expired). After, it will check to see if there is anything in the /hbase/.logs directory for that region server; if something is found, master will replay the log records and 'push' them
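The recovery sequence described above can be walked through in miniature: notice the expired znode, replay whatever WAL entries the dead server left behind, then the regions are safe to reassign. The data shapes here (dicts standing in for ZK, the logs directory, and tables) are invented for illustration:

```python
def handle_rs_death(rs, live_znodes, logs, tables):
    """Toy walk-through of master-side recovery: if the server's znode is
    gone, replay its leftover WAL entries into the tables, then report
    that the regions need reassignment."""
    if rs in live_znodes:            # znode still present: server is alive
        return False
    # Replay everything found under the /hbase/.logs/<rs> equivalent.
    for table, row, value in logs.pop(rs, []):
        tables.setdefault(table, {})[row] = value
    return True                      # caller would now reassign regions
```

The key ordering point the email makes is preserved: the log replay happens before regions are handed to new servers, so no acknowledged edit is lost.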

Re: prevent region splits?

2011-09-03 Thread Jack Levin
Make hbase.hregion.max.filesize very large. Then your regions won't split. We use this method when copying 'live' hbase to make a backup. -Jack On Sat, Sep 3, 2011 at 4:32 PM, Geoff Hendrey wrote: > Is there a way to prevent regions from splitting while we are running a > mapreduce job th

Re: Avatar namenode?

2011-08-18 Thread Jack Levin
I don't think anyone except Facebook actually uses it. Their case is special, as they have millions and millions of files in HDFS. -Jack On Thu, Aug 11, 2011 at 11:19 AM, shanmuganathan.r wrote: > > Hi All, > >      I am running the HBase distributed mode in seven node cluster with >

Re: HBase connection pool

2011-07-18 Thread Jack Levin
here is connection pool implementation that works for us: http://pastebin.com/KBv9t7S8 -Jack On Sun, Jul 17, 2011 at 11:16 PM, aadish wrote: > > Hey people, > > I am very new to HBase, and I would like if someone gave me guidance > regarding connection pooling. > > Thanks a lot, in advance. > >
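The pastebin contents are not reproduced here, but such a pool usually takes the same check-out/check-in shape regardless of what it holds. A generic sketch under that assumption, with `factory` standing in for whatever creates an HTable/connection:

```python
import queue

class ConnectionPool:
    """Generic fixed-size pool: pre-create connections, hand them out,
    take them back. Blocking on an empty pool gives natural backpressure."""
    def __init__(self, factory, size=4):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self, timeout=None):
        # Blocks until a connection is free (or raises queue.Empty on timeout).
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)
```

A caller would wrap acquire/release in try/finally so a failed request still returns its connection to the pool.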

Re: User of FilterList

2011-07-16 Thread Jack Levin
Be mindful that if you are using a scanner with filters, RowKey remains the index of the table, and that filter just filters your results based on how you run your scanner, similarly to "cat file | grep filter", where if "file" is your table and has many lines (rows), your scan might be very ineffi
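The point above, in miniature: a filter still walks every row the scan covers (like `cat file | grep`), while a row-key range exploits the table's sort order to touch only the matching slice. A toy contrast with rows as sorted (key, value) pairs:

```python
import bisect

def scan_with_filter(rows, predicate):
    """Filter-style scan: examines every row, keeps the matches."""
    return [(k, v) for k, v in rows if predicate(v)]       # O(all rows)

def scan_by_rowkey(rows, start, stop):
    """Row-key range scan: binary-search the sorted keys, slice."""
    keys = [k for k, _ in rows]                            # already sorted
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, stop)
    return rows[lo:hi]                                     # O(matching rows)
```

On a table with many rows and few matches, the first function's cost is proportional to the whole scan range; the second's only to the answer.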

Re: hbase table as a queue.

2011-07-16 Thread Jack Levin
out from the middle of the table. -Jack On Sat, Jul 16, 2011 at 9:38 AM, Jack Levin wrote: > Hello, we are thinking about using Hbase table as a simple queue which > will dispatch the work for a mapreduce job, as well as real time > fetching of data to present to end user.  In sim

hbase table as a queue.

2011-07-16 Thread Jack Levin
Hello, we are thinking about using Hbase table as a simple queue which will dispatch the work for a mapreduce job, as well as real time fetching of data to present to end user. In simple terms, suppose you had a data source table and a queue table. The queue table has a smaller set of Rows that p
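The two-table idea above can be sketched with dicts standing in for tables: a big source table plus a small queue table whose rows just point at source rows awaiting work. (Real HBase would need something like checkAndDelete to make the claim step safe across concurrent workers; that is glossed over here.)

```python
def enqueue(queue_table, source_key):
    """Record that a source row needs processing."""
    queue_table[source_key] = "pending"

def dequeue(queue_table, source_table):
    """Claim the next work item: scan only the small queue table,
    delete the marker row, and fetch the payload from the source."""
    for key in sorted(queue_table):      # rows come back in key order
        del queue_table[key]             # claim (unsafe if workers race)
        return key, source_table[key]
    return None
```

The payoff the email is after: the mapreduce job and the real-time reader both scan the small queue table, never the whole source table.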

Re: php to thrift vs java api

2011-07-16 Thread Jack Levin
r with (de)serialization. You may want to grab the > latest nightly build of thrift as it has quite a few bug fixes in the php > thrift extension. > > ~Jeff > > On 7/11/2011 11:22 PM, Jack Levin wrote: >> >> For those who are interested, I did some loadtesting

php to thrift vs java api

2011-07-11 Thread Jack Levin
For those who are interested, I did some loadtesting of Puts and Gets speeds using PHP -> Thrift Server -> HBASE, and Java API Client -> HBASE. Writing and reading 5 - 10 byte cells (from Cache), is 30 times faster using Java API client. So I am going to assume that writing near realtime applica

Re: tech. talk at imageshack/yfrog

2011-07-01 Thread Jack Levin
sted to look at your implementation >> (presentation and possibly video). >> >> Ravi >> >> On Thursday, June 30, 2011, Mark Kerzner wrote: >> > Thank you, looks very interesting. >> > >> > Mark >> > >> > On Thu, Jun 30, 201

Re: tech. talk at imageshack/yfrog

2011-06-30 Thread Jack Levin
http://www.slideshare.net/jacque74/hug-hbase-presentation My presentation posted on slideshare, from today's talk. FYI. Best, -Jack On Fri, Jun 10, 2011 at 3:11 PM, Jack Levin wrote: > Yep.  I'd do HUG, its probably larger building/room anyway :). > > -Jack > > On Fri, Jun

Re: tech. talk at imageshack/yfrog

2011-06-10 Thread Jack Levin
Yep. I'd do HUG, its probably larger building/room anyway :). -Jack On Fri, Jun 10, 2011 at 11:39 AM, Jean-Daniel Cryans wrote: > Would you like to transform this into a HUG? See > http://www.meetup.com/hbaseusergroup/ > > J-D > > On Wed, Jun 8, 2011 at 12:07 AM, Jack Le

tech. talk at imageshack/yfrog

2011-06-07 Thread Jack Levin
Hey Guys, I plan to do a tech talk here at ImageShack, on how we store and serve about 200 million images from HBASE. The stats of our setup are: 60 Region Servers running HBASE Configured Capacity : 517.44 TB DFS Used : 286.93 TB Table count: 1,000 Average file size: 500KB I am doi

Re: exporting from hbase as text (tsv)

2011-06-07 Thread Jack Levin
rote: >> On Tue, Jun 7, 2011 at 10:20 AM, Jack Levin wrote: >>> That would be a real nice feature though, imagine going to the shell, >>> and requesting a dump of your table. >>> >> >> But it would only work for mickey mouse tables, no? If your >

Re: exporting from hbase as text (tsv)

2011-06-07 Thread Jack Levin
That would be a real nice feature though, imagine going to the shell, and requesting a dump of your table. > load into outfile/hdfs '/tmp/table_a' from table_a Or something similar. > load from outfile/hdfs -- could power bulk uploader -Jack On Mon, Jun 6, 2011 at 9:09 PM, J

Re: Hbase Hardware requirement

2011-06-07 Thread Jack Levin
Depends on the load. We have a huge cluster running, 4 x 2 TB disks, Core 2 Duo 2.5 Ghz, 8 GB RAM, with 60 nodes, using it mostly for binary cold storage of photos, with very low access rates and moderate write rates. The second cluster is Core i7 Quad (hyperthreaded) 3.0Ghz, with 16GB RAM, 4x2TB dr

Re: exporting from hbase as text (tsv)

2011-06-06 Thread Jack Levin
> Can you hook hive to hbase? Yes, we used hbase to hive and back before, but it's not real flexible, especially going the hbase -> hive route. We much prefer a bulk uploader tool for modified tables via hive map-reduce of tsv or csv. -Jack

Re: exporting from hbase as text (tsv)

2011-06-06 Thread Jack Levin
to now the names of your column families, but besides that it > could be done fairly generically. > > > > On Mon, Jun 6, 2011 at 3:57 PM, Jack Levin wrote: > >> Hello, does anyone have any tools you could share that would take a >> table, and dump the contents as TSV tex

Re: mslab enabled jvm crash

2011-06-06 Thread Jack Levin
We have two production clusters, and we don't on either. We also have days and days worth of no CMF reported. Here is my config that works great for us: export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:MaxDirectMemorySize=2G" # Uncomment below to enable java garbage collection logging.

exporting from hbase as text (tsv)

2011-06-06 Thread Jack Levin
Hello, does anyone have any tools you could share that would take a table, and dump the contents as TSV text format? We want it in tsv for quick HIVE processing that we have in another datamining cluster. We do not want to write custom map-reduce jobs for hbase because we already have an exte

Re: feature request (count)

2011-06-03 Thread Jack Levin
lumn. Keeping > track of the row count as new rows are created is also not as easy as > it seems - this is because a Put does not know if a row already exists > or not.  Making it aware of that fact would require doing a get before > a put - not cheap. > > -ryan > > On Fri

feature request (count)

2011-06-03 Thread Jack Levin
I have a feature request: There should be a native function called 'count', that produces count of rows based on specific family filter, that is internal to HBASE and won't be required to read CELLs off the disk/cache. Just count up the rows in the most efficient way possible. I realize that fam
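The closest thing to this request that exists is scanning with a first-key-only style filter (FirstKeyOnlyFilter in the real API, which the shell's `count` uses), so counting touches one cell per row instead of full values. A toy version, modeling a table as `{row: {column: value}}`:

```python
def count_rows(table):
    """Count rows while touching at most one cell per row, mimicking
    a scan under a first-key-only filter."""
    count = 0
    for _row, cells in table.items():
        next(iter(cells), None)   # look at only the first cell, if any
        count += 1
    return count
```

It still walks every row, which is exactly the inefficiency the feature request wants the server to avoid.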

Re: ANN: HBase 0.90.3 available for download

2011-05-31 Thread Jack Levin
Hello, is there a git repo URL I could use to check out that code version? -Jack On Thu, May 19, 2011 at 2:35 PM, Stack wrote: > The Apache HBase team is happy to announce that HBase 0.90.3 is > available from the Apache mirror of choice: > >  http://www.apache.org/dyn/closer.cgi/hbase/ > > HBas

Re: mslab enabled jvm crash

2011-05-26 Thread Jack Levin
It might sound crazy, but if you have plenty of CPU, consider lowering your NewSize to like 30MB; if you do that your ParNews will be more frequent, but hitting CMS failure will be less likely, this is what we've seen. -Jack On Thu, May 26, 2011 at 10:51 AM, Jack Levin wrote: > Wayne, we get

Re: mslab enabled jvm crash

2011-05-26 Thread Jack Levin
Wayne, we get CMS failures also, I am pretty sure they are fragmentation related: 2011-05-26T09:20:00.304-0700: 206371.599: [GC 206371.599: [ParNew (promotion failed): 76633K->76023K(76672K), 0.0924180 secs]206371.692: [CMS: 11452308K->7142504K(12202816K), 13.5870310 secs] 11525447K->7142504K(122

Re: mslab enabled jvm crash

2011-05-26 Thread Jack Levin
Wayne, I think you are hitting fragmentation, how often do you flush? Can you share memstore flush graphs? Here is ours: http://img851.yfrog.com/img851/9814/screenshot20110526at124.png We run at 12G Heap, 20% memstore size, 50% blockcache, have recently added incremental mode to combat too frequ
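A back-of-envelope for the numbers quoted above (12 GB heap, 20% global memstore limit, 50% block cache) shows how much heap each slice gets and what remains for everything else, which is the headroom CMS fragmentation eats into:

```python
def heap_budget(heap_gb, memstore_frac, blockcache_frac):
    """Split a regionserver heap into memstore, block cache, and the rest."""
    memstore = heap_gb * memstore_frac
    blockcache = heap_gb * blockcache_frac
    return memstore, blockcache, heap_gb - memstore - blockcache

# e.g. heap_budget(12, 0.20, 0.50) is roughly (2.4, 6.0, 3.6) GB
```

Only about 3.6 GB is left for working objects in that configuration, which is why flush frequency and NewSize tuning matter so much here.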

Re: 0.90.3

2011-05-24 Thread Jack Levin
at 7:03 PM, Jack Levin wrote: > "HBase uses the local hostname to self-report it's IP address." > > using 'hostname' as authoritative name for regionserver is what caused > all of the confusion, hostname usually not governed by name resolution > (/etc/ho

Re: 0.90.3

2011-05-24 Thread Jack Levin
's all done in HBase which in > turn stores it in ZK. > > Also http://hbase.apache.org/book.html#dns > > J-D > > On Tue, May 24, 2011 at 4:37 PM, Jack Levin wrote: >> figured it out... the /etc/hosts file has ip to name, was used by >> zookeeper was *.prod.imageshac

Re: 0.90.3

2011-05-24 Thread Jack Levin
Then I recommend scratching hostname use in lieu of reverse lookup only -Jack On May 24, 2011, at 5:45 PM, Andrew Purtell wrote: >> From: Jack Levin >> figured it out... the /etc/hosts file has ip to name, was used by >> zookeeper was *.prod.imageshack.com,

Re: 0.90.3

2011-05-24 Thread Jack Levin
... it's gotta be consistent, otherwise aliases end up screwing things up and people will end up guessing why things don't work. -Jack On Tue, May 24, 2011 at 4:04 PM, Jack Levin wrote: > img645.prod.imageshack.us and img645.imageshack.us are both point to > the same IP. > > -Jack >

Re: 0.90.3

2011-05-24 Thread Jack Levin
img645.prod.imageshack.us and img645.imageshack.us both point to the same IP. -Jack On Tue, May 24, 2011 at 3:50 PM, Jack Levin wrote: > looks like our balancer is on: > > hbase(main):001:0> balance_switch true > true > 0 row(s) in 0.3700 seconds > > I simply kill

Re: 0.90.3

2011-05-24 Thread Jack Levin
e book for > more details: > http://hbase.apache.org/book.html#decommission > > Dave > > On Tue, May 24, 2011 at 3:33 PM, Jack Levin wrote: > >> just put new hbase version on our test cluster. and been testing it... >> so far if I shutdown an RS, master does not reas

0.90.3

2011-05-24 Thread Jack Levin
just put the new hbase version on our test cluster and been testing it... so far if I shutdown an RS, master does not reassign its regions, and we remain inconsistent forever; likewise when a new RS is up, it does not get regions assigned to it, this is the master log: 2011-05-24 15:30:57,724 DEBUG o

Re: hbase master retries to RS/DN

2011-05-21 Thread Jack Levin
: > Are you running at INFO level logging Jack?  Can you pastebin more log > context.  I'd like to take a look. > Thanks, > St.Ack > > On Thu, May 19, 2011 at 11:36 PM, Jack Levin wrote: >> Thanks, now with setting that value to "2", we still get slow DN death >

Re: hbase master retries to RS/DN

2011-05-19 Thread Jack Levin
he/hadoop/blob/branch-0.20/src/core/org/apache/hadoop/ipc/Client.java#L631 > > J-D > > On Thu, May 19, 2011 at 11:46 AM, Jack Levin wrote: >> Hello, we have a situation when when RS/DN crashes hard, master is >> very slow to recover, we notice that it waits on these log line

Re: GC and High CPU

2011-05-19 Thread Jack Levin
king about GC black magic) > >> From: Jack Levin >> Subject: Re: GC and High CPU >> To: user@hbase.apache.org >> Date: Monday, May 16, 2011, 5:06 PM >> >> Those are the lines I added: >> >> -XX:+CMSIncrementalMode \  < >> -XX:+CMSIncr

Re: Performance degrades on moving from desktop to blade environment

2011-05-19 Thread Jack Levin
exception : yes >> cpuid level : 11 >> wp : yes >> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat >> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm >> constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_

hbase master retries to RS/DN

2011-05-19 Thread Jack Levin
Hello, we have a situation where, when RS/DN crashes hard, master is very slow to recover, we notice that it waits on these log lines: 2011-05-19 11:20:57,766 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.103.7.22:50020. Already tried 0 time(s). 2011-05-19 11:20:58,767 INFO org.a

Re: GC and High CPU

2011-05-17 Thread Jack Levin
com/technetwork/java/gc-tuning-5-138395.html#1.1.Sizing%20the%20Generations%7Coutline >> >have >> a ratio of 1/3 between young/old, so that would be 1gb young, but that >> is probably bigger than you need.  I'm far from an expert though... what >> size do other peopl

Re: GC and High CPU

2011-05-16 Thread Jack Levin
hough the > difference is plain from the GC logs you show. > > See below: > > On Mon, May 16, 2011 at 5:06 PM, Jack Levin wrote: >> Those are the lines I added: >> >> -XX:+CMSIncrementalMode \ < > > > From the doc., it says about i-cms and ja

Re: GC and High CPU

2011-05-16 Thread Jack Levin
increase the chance of stop-the-world GC and should be avoided. > > - Andy > (who always gets nervous when we start talking about GC black magic) > >> From: Jack Levin >> Subject: Re: GC and High CPU >> To: user@hbase.apache.org >> Date: Monday, May 16,

Re: GC and High CPU

2011-05-16 Thread Jack Levin
> default GC previous? > > ParNew is much bigger now. > > St.Ack > > On Mon, May 16, 2011 at 4:11 PM, Jack Levin wrote: >> I think this will resolve my issue, here is the output: >> >>     14 2011-05-16T15:58 >>     13 2011-05-16T15:59 >>     12 2011-05

Re: GC and High CPU

2011-05-16 Thread Jack Levin
yFraction=70 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError -Xloggc:$HBASE_HOME/logs/gc-hbase.log \ -XX:+CMSIncrementalMode \ < -XX:+CMSIncrementalPacing \ <--- -XX:-TraceClassUnloading <--- This way GC statistically adapts, and lessens the

Re: GC and High CPU

2011-05-16 Thread Jack Levin
, Stack wrote: > On Mon, May 16, 2011 at 1:36 PM, Jack Levin wrote: >> How do you tell?  This is the log entries when we had 100% cpu: >> >> 2011-05-14T15:48:58.240-0700: 5128.407: [GC 5128.407: [ParNew: >> 17723K->780K(19136K), 0.0199350 secs] 4309804K->4292973K(577

Re: GC and High CPU

2011-05-16 Thread Jack Levin
t;970K(19136K), 0.0145400 secs] 4312132K->4295460K(5777060K), 0.0146530 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] -Jack On Mon, May 16, 2011 at 12:44 PM, Stack wrote: > What is the size of your new gen?  Is it growing?  Does it level off? > St.Ack > > On Mon, May 16, 201

Re: Performance degrades on moving from desktop to blade environment

2011-05-16 Thread Jack Levin
What is the clock rate of your CPUs (desktop vs blade)? -Jack On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary wrote: > Yes, it is only the HW that was changed . All the configurations are kept at > default from the cloudera installer. > > The regionserver logs semms ok. > > On Mon, May 16, 201

Re: GC and High CPU

2011-05-16 Thread Jack Levin
at 12:02 PM, Stack wrote: > On Sun, May 15, 2011 at 5:37 PM, Jack Levin wrote: >> I've added occupancy:  export HBASE_OPTS="$HBASE_OPTS -verbose:gc >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails >> -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError

Re: Performance degrades on moving from desktop to blade environment

2011-05-16 Thread Jack Levin
We had issues moving onto a 32-core AMD box also. The issue was revolving around the datanode getting slow after about 12 hours. What you need to do is check the fsreadlatency_ave_time graph; if it appears spiky then you have a problem with IO. Next get a graph of "Runnable Threads", they should be flat

Re: Pagination through families / columns?

2011-05-16 Thread Jack Levin
When we change versions to 1 from 3 on the hbase table schema, things appear to work right. -Jack On Mon, May 16, 2011 at 12:14 PM, Jean-Daniel Cryans wrote: > It doesn't look like you are doing something wrong, also I looked at > the unit tests and they seem to cover the basic usage of > ColumnPaginati
