Re: HBase Scan consumes high cpu

2019-09-18 Thread Manjeet Singh
> Hi > When you did a put with a lower qualifier int (put 'mytable', 'MY_ROW', "pcf:\x0A", "\x00") the system flow is getting a valid

Re: HBase Scan consumes high cpu

2019-09-18 Thread Solvannan R M
> doing a seek which just avoids all the in between deletes and puts processing.. > In 1st case the Filter won't get into action at all unless the scan flow

Re: HBase Scan consumes high cpu

2019-09-16 Thread ramkrishna vasudevan
> et part itself along with the column range filter. I mean > get 'mytable', 'MY_ROW', {COLUMN=>['pcf: *1499000 * '], FILTER=>ColumnRangeFilter.new(Bytes.toBytes(1499000.to_java(:int

Re: HBase Scan consumes high cpu

2019-09-16 Thread Solvannan R M
> es.toBytes(1499000.to_java(:int)), true, Bytes.toBytes(1499010.to_java(:int)), false)} > Pardon the syntax it might not be proper for the shell.. Can this be done? > This will make the scan to make

Re: HBase Scan consumes high cpu

2019-09-13 Thread ramkrishna vasudevan
> java(:int)), false)} > Pardon the syntax it might not be proper for the shell.. Can this be done? > This will make the scan to make a seek to the given qualifier at 1st step itself. > Anoop > On

Re: HBase Scan consumes high cpu

2019-09-13 Thread Solvannan R M
> ke the scan to make a seek to the given qualifier at 1st step itself. > Anoop > On Thu, Sep 12, 2019 at 10:18 PM Udai Bhan Kashyap (BLOOMBERG/ PRINCETON) <ukashy...@bloomberg.net> wrote: > Are you keeping the deleted cells? Chec

Re: HBase Scan consumes high cpu

2019-09-12 Thread Anoop John
s? Check 'VERSIONS' for the column family > and set it to 1 if you don't want to keep the deleted cells. > > From: user@hbase.apache.org At: 09/12/19 12:40:01 To: > user@hbase.apache.org > Subject: Re: HBase Scan consumes high cpu > > Hi, > > As said

Re: HBase Scan consumes high cpu

2019-09-12 Thread Udai Bhan Kashyap (BLOOMBERG/ PRINCETON)
Are you keeping the deleted cells? Check 'VERSIONS' for the column family and set it to 1 if you don't want to keep the deleted cells. From: user@hbase.apache.org At: 09/12/19 12:40:01 To: user@hbase.apache.org Subject: Re: HBase Scan consumes high cpu Hi, As said earlier, we

Re: HBase Scan consumes high cpu

2019-09-12 Thread Solvannan R M
Hi, As said earlier, we have populated the rowkey "MY_ROW" with integers from 0 to 150 as column qualifiers. Then we have deleted the qualifiers from 0 to 1499000. We executed the following query. It took 15.3750 seconds to execute. hbase(main):057:0> get 'mytable', 'MY_ROW', {COLUMN=>['

Re: HBase Scan consumes high cpu

2019-09-10 Thread Josh Elser
Deletes are held in memory. They represent data you have to traverse until that data is flushed out to disk. When you write a new cell with a qualifier of 10, that sorts, lexicographically, "early" with respect to the other qualifiers you've written. By that measure, if you are only scanning f
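The sort-order point above can be checked without a cluster. The following is only a minimal sketch, assuming the qualifiers are 4-byte big-endian ints (the layout Bytes.toBytes(int) produces) and that cells within a row are ordered by unsigned lexicographic comparison of the qualifier bytes. The class and method names are made up for illustration and are not part of HBase.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: mimics how int qualifiers
// are laid out as bytes and how those bytes compare.
final class QualifierOrder {
    // 4-byte big-endian encoding (ByteBuffer's default byte order),
    // assumed to match the output of Bytes.toBytes(int).
    static byte[] encode(int qualifier) {
        return ByteBuffer.allocate(4).putInt(qualifier).array();
    }

    // Unsigned lexicographic comparison of the encoded bytes (Java 9+).
    static int compare(int a, int b) {
        return Arrays.compareUnsigned(encode(a), encode(b));
    }

    public static void main(String[] args) {
        // A negative result means the first qualifier sorts earlier in the row.
        System.out.println("10 vs 1499000: " + compare(10, 1499000));
        System.out.println("1499000 vs 1499010: " + compare(1499000, 1499010));
    }
}
```

Under these assumptions a newly written qualifier 10 lands in front of the deleted range, so a scanner that starts at the beginning of the row finds a live cell there before it reaches the qualifiers the filter is asking for.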

HBase Scan consumes high cpu

2019-09-10 Thread Solvannan R M
Hi, We have been using HBase (1.4.9) for a case where timeseries data is continuously inserted and deleted (high churn) against a single rowkey. The column keys would represent timestamp more or less. When we scan this data using ColumnRangeFilter for a recent time-range, scanner for the sto

Re: HBase Scan Thread Stuck after Upgrade to 1.4.9

2019-04-23 Thread Toshihiro Suzuki
Maybe you were doing scans with filters that were very heavy and taking a long time. According to the thread dumps, it looks like the 2 scans were not stuck as they were RUNNABLE and I think the scans were just taking a long time. On Thu, Apr 18, 2019 at 9:06 AM Srinidhi Muppalla wrote: > Hello,

HBase Scan Thread Stuck after Upgrade to 1.4.9

2019-04-17 Thread Srinidhi Muppalla
Hello, We recently upgraded our staging cluster running on S3 backed EMR from HBase 1.3.0 to HBase 1.4.9. After doing so and running a test load, we noticed a sudden spike in CPU utilization on one of the nodes in the cluster (jumping from 20% to 60% all at once). After looking at CPU time of th

How to GROUP BY AND AGGREGATE in hbase SCAN using spark/scala.

2018-08-23 Thread neo0731
Hi everyone, I am trying to group by and aggregate over HBase data using Spark/Scala, but I am not able to figure out how to do it with the HBase RDD API for Scala. I attached a file showing what my input and output datasets in HBase should look like. any help on i

Re: HBase scan with setBatch for more than 1 column family

2018-07-02 Thread Ted Yu
Please see the following two constants defined in TableInputFormat : /** Column Family to Scan */ public static final String SCAN_COLUMN_FAMILY = "hbase.mapreduce.scan.column.family"; /** Space delimited list of columns and column families to scan. */ public static final String SCAN_COL

HBase scan with setBatch for more than 1 column family

2018-07-02 Thread revolutionisme
Hi, I am using HBase with Spark and as I have wide columns (> 1) I wanted to use the "setbatch(num)" option to not read all the columns for a row but in batches. I can create a scan and set the batch size I want with TableInputFormat.SCAN_BATCHSIZE, but I am a bit confused how this would wor

Re: hbase scan with column family filter - limit results per column prefix

2018-05-01 Thread Ted Yu
From your description, you can combine ColumnPrefixFilter with PageFilter (thru FilterList). FYI On Tue, May 1, 2018 at 6:06 AM, mrmiroslav wrote: > I'd like to perform Get / Scan with java client. > > In a given column family I'd like to limit the number of results per given > column qualifi

hbase scan with column family filter - limit results per column prefix

2018-05-01 Thread mrmiroslav
I'd like to perform Get / Scan with java client. In a given column family I'd like to limit the number of results per given column qualifier prefix. Is there a way of achieving this without redesigning schema and moving Column Qualifiers into Column Families? Column Qualifiers are dynamic, howev

Re: Hbase scan returns row result twice

2017-07-23 Thread Ted Yu
You can call the following method of Result to see if any Result is partial: public boolean isPartial() { FYI On Fri, Jul 21, 2017 at 9:49 AM, Veerraju Tadimeti wrote: > Hi, > > I am scanning hbase through Hive. I am using a coprocessor implementing > postScannerNext(). For each result, I

Hbase scan returns row result twice

2017-07-21 Thread Veerraju Tadimeti
Hi, I am scanning hbase through Hive. I am using a coprocessor implementing postScannerNext(). For each result, I am executing Get operation. postScannerNext(final ObserverContext e, final InternalScanner s, final List results, final int limit, final boolean hasMore){

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-02 Thread Hef
t; > > >> > > > > > >> > > - The row key is defined as: > > > >> > > sault(1byte) + time_of_hour(4bytes) + uuid(36bytes) > > > >> > > > > > >> > > > > > >> > > - The scan is created as below: > > > >> > > > > > >> > > Scan scan = new Scan(); > > > >> > > scan.setBatch(100); > > > >> > > scan.setCaching(1); > > > >> > > scan.setCacheBlocks(false); > > > >> > > scan.setMaxVersions(1); > > > >> > > > > > >> > > > > > >> > > And the row filter for the scan is a FuzzyRowFilter that filters > > > only > > > >> > > events of a given time_of_hour. > > > >> > > > > > >> > > Everything looks fine while the result is out of expect. > > > >> > > A same task runs 10 times, the Input Records counters show 6 > > > different > > > >> > > numbers, and the final output shows 6 different results. > > > >> > > > > > >> > > Does anyone has every faced this problem before? > > > >> > > What could be the cause of this inconsistency of HBase scan > > result? > > > >> > > > > > >> > > Thanks > > > >> > > > > >> > > > > > >

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-02 Thread Ted Yu
> - The scan is created as below: > Scan scan = new Scan(); scan.setBatch(100); scan.setCaching(1); scan.setCacheBlocks(false); scan.setMaxVersions(1); > And the row filter for the scan is a FuzzyRowFilter that filters only events of a given time_of_hour. > Everything looks fine, but the result is not as expected. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-02 Thread Hef
> e scan is created as below: > Scan scan = new Scan(); scan.setBatch(100); scan.setCaching(1); scan.setCacheBlocks(false); scan.setMaxVersions(1); > And the row filter for the scan is a FuzzyRowFilter that filters only events of a given time_of_hour. > Everything looks fine, but the result is not as expected. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-01 Thread Sean Busbey
> scan.setCaching(1); scan.setCacheBlocks(false); scan.setMaxVersions(1); > And the row filter for the scan is a FuzzyRowFilter that filters only events of a given time_of_hour. > Everything looks fine, but the result is not as expected. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-01 Thread Ted Yu
> And the row filter for the scan is a FuzzyRowFilter that filters only events of a given time_of_hour. > Everything looks fine, but the result is not as expected. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-01 Thread Hef
> hat filters only events of a given time_of_hour. > Everything looks fine, but the result is not as expected. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

Re: HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-01 Thread Ted Yu
> times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. > Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? > Thanks

HBase scan returns inconsistent results on multiple runs for same dataset

2017-03-01 Thread Hef
xpect. When the same task is run 10 times, the Input Records counters show 6 different numbers, and the final output shows 6 different results. Has anyone ever faced this problem before? What could be the cause of this inconsistency in the HBase scan result? Thanks

RE: Internals of hbase scan

2016-12-28 Thread andrew
Reading the source code -Original Message- From: Rajeshkumar J [mailto:rajeshkumarit8...@gmail.com] Sent: December 28, 2016 17:00 To: user@hbase.apache.org Subject: Internals of hbase scan Can anyone point me where I can learn internals of hbase such as scan in depth.

Re: Internals of hbase scan

2016-12-28 Thread Ted Yu
You can step through using IDE. Below is sample stack trace from running a unit test: ClientSimpleScanner(ClientScanner).loadCache() line: 382 ClientSimpleScanner(ClientScanner).nextWithSyncCache() line: 357 ClientSimpleScanner.next() line: 50 ResultScanner$1.hasNext() line: 54 TestMobCompactor.c

Re: Internals of hbase scan

2016-12-28 Thread Rajeshkumar J
Thanks for your reply. When I call ResultScanner.next(), can you point me to the class that control switches to? On Wed, Dec 28, 2016 at 6:04 PM, Ted Yu wrote: > You can start from http://hbase.apache.org/book.html#hregion.scans > > To get to know internals, you should look at the

Re: Internals of hbase scan

2016-12-28 Thread Ted Yu
You can start from http://hbase.apache.org/book.html#hregion.scans To get to know internals, you should look at the code - in IDE such as Eclipse. Start from StoreScanner and read the classes which reference it. Cheers On Wed, Dec 28, 2016 at 12:59 AM, Rajeshkumar J wrote: > Can anyone point m

Internals of hbase scan

2016-12-28 Thread Rajeshkumar J
Can anyone point me where I can learn internals of hbase such as scan in depth.

Re: does hbase scan doubts

2016-03-22 Thread Jerry He
: > Is an HBase scan or get single-threaded? > Say I have an HBase table with 100 regionservers. > > When I scan a key range, say a-z (distributed on all regionservers), will the client make calls to the regionservers in parallel all at once, or one by one? First it will get all keys from one

does hbase scan doubts

2016-03-13 Thread Shushant Arora
Is an HBase scan or get single-threaded? Say I have an HBase table with 100 regionservers. When I scan a key range, say a-z (distributed on all regionservers), will the client make calls to the regionservers in parallel all at once, or one by one? First it will get all keys from one regionserver then

Pagination of HBase Scan output

2016-01-05 Thread Sreeram Venkatasubramanian
Hi, I am looking for options to batch the output of an HBase scan with a prefix filter, so that it can be paginated at the front end. Please let me know if there are recommended methods to do the same. Thank you. Sreeram

Re: Pagination of HBase Scan output

2016-01-05 Thread Ted Yu
for options to batch the output of an HBase scan with a prefix > filter, so that it can be paginated at the front end. > > Please let me know if there are recommended methods to do the same. > > Thank you. > > Sreeram

Re: HBase scan time range, inconsistency

2015-02-27 Thread ramkrishna vasudevan
In your case since the TTL is set to the max and you have a timeRange in your scan it would go with the first case. Every time it would try to fetch only one version ( the latest) for the given record but if the time Range is not falling in the latest then it would skip those cells. But my doubt is

Re: HBase scan time range, inconsistency

2015-02-26 Thread Stephen Durfey
Right, it is 1 by default, but shouldn't the Scan return a version from within that time range if there is one? Without the number of versions specified, I thought it was the most recent version, is that the most recent version within the time range, or the most recent version in the history of the

Re: HBase scan time range, inconsistency

2015-02-26 Thread Ted Yu
The maxVersions field of Scan object is 1 by default: private int maxVersions = 1; Cheers On Thu, Feb 26, 2015 at 12:31 PM, Stephen Durfey wrote: > > > > 1) What do you mean by saying your have a partitioned HBase table? > > (Regions and partitions are not the same) > > > By partitions, I ju

Re: HBase scan time range, inconsistency

2015-02-26 Thread Stephen Durfey
> > 1) What do you mean by saying your have a partitioned HBase table? > (Regions and partitions are not the same) By partitions, I just mean logical partitions, using the row key to keep data from separate data sources apart from each other. I think the issue may be resolved now, but it isn't o

Re: HBase scan time range, inconsistency

2015-02-26 Thread Michael Segel
Ok… Silly question time… so just humor me for a second. 1) What do you mean by saying you have a partitioned HBase table? (Regions and partitions are not the same) 2) There’s a question of the isolation level during the scan. What happens when there is a compaction running or there’s RLL t

Re: HBase scan time range, inconsistency

2015-02-25 Thread Stephen Durfey
> > Are you writing any Deletes? Are you writing any duplicates? No physical deletes are occurring in my data, and there is a very real possibility of duplicates. How is the partitioning done? > The key structure would be /partition_id/person_id I'm dealing with clinical data, with a data

Re: HBase scan time range, inconsistency

2015-02-25 Thread Sean Busbey
Are you writing any Deletes? Are you writing any duplicates? How is the partitioning done? What does the entire key structure look like? Are you doing the column filtering with a custom filter or one of the prepackaged ones? On Wed, Feb 25, 2015 at 12:57 PM, Stephen Durfey wrote: > > > > What

Re: HBase scan time range, inconsistency

2015-02-25 Thread Stephen Durfey
> > What's the TTL setting for your table ? > > Which hbase release are you using ? > > Was there compaction in between the scans ? > > Thanks > The TTL is set to the max. The HBase version is 0.94.6-cdh4.4.0. I don’t want to say compactions aren’t a factor, but the jobs are short lived (4-5 minut

Re: HBase scan time range, inconsistency

2015-02-24 Thread ramkrishna vasudevan
>> These numbers have varied wildly, from being off by 2-3 between subsequent scans to 40 row increases, followed by a drop of 70 rows. When you say there is a variation in the number of rows retrieved - the 40 rows that got increased - are those rows in the expected time range? Or is the system re

Re: HBase scan time range, inconsistency

2015-02-24 Thread Ted Yu
What's the TTL setting for your table ? Which hbase release are you using ? Was there compaction in between the scans ? Thanks > On Feb 24, 2015, at 2:32 PM, Stephen Durfey wrote: > > I have some code that accepts a time range and looks for data written to an > HBase table during that range

HBase scan time range, inconsistency

2015-02-24 Thread Stephen Durfey
I have some code that accepts a time range and looks for data written to an HBase table during that range. If anything has been written for that row during that range, the row key is saved off, and sometime later in the pipeline those row keys are used to extract the entire row. I’m testing agai

Re: Hbase scan using TIMERANGE

2015-02-05 Thread Bing Jiang
> them. Also i need the counts per day wise where i pass the date parameter to the shell script which calls these scan commands. I did find a way to convert the date to epoch time and pass it to scan command but the s
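The date-to-epoch conversion mentioned in the quoted question can be sketched with the JDK alone. This is an illustration, assuming the time range is given in epoch milliseconds with an inclusive lower bound and an exclusive upper bound, and that the day boundaries should be taken in UTC. The class name and the table name in the usage note are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper: turns an ISO date (yyyy-MM-dd) into a one-day time range.
final class DayTimeRange {
    // Returns {minStamp, maxStamp} in epoch milliseconds for the given UTC day,
    // where minStamp is the start of that day and maxStamp the start of the next.
    static long[] forDay(String isoDate) {
        LocalDate day = LocalDate.parse(isoDate);
        long start = day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        long end = day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        long[] range = forDay(args.length > 0 ? args[0] : "2015-02-04");
        // Print the two values so a shell script can pass them on.
        System.out.println(range[0] + " " + range[1]);
    }
}
```

The two numbers could then be substituted into a shell command of the form scan 'mytable', {TIMERANGE => [min, max]}. Note that TIMERANGE restricts on cell timestamps, so a scan over a large table may still be slow for reasons unrelated to the conversion.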

Re: Hbase scan using TIMERANGE

2015-02-04 Thread Ted Yu
> keeps running forever. Can some one help me in making this faster. > Note: I am scanning the tables based on TIMERANGE as all the tables have this field. > Thanks, > Yogi > -- > View this message in context: http://apache-hbase.679495.n3.nabble.com/Hbase-scan-using-TIMERANGE-tp4060851.html > Sent from the HBase User mailing list archive at Nabble.com.

Re: Hbase scan using TIMERANGE

2015-02-04 Thread Bing Jiang
> mmand but the scan keeps running forever. Can some one help me in making this faster. > Note: I am scanning the tables based on TIMERANGE as all the tables have this field. > Thanks, > Yogi > --

Re: Hbase Scan/Snapshot Performance...

2014-08-12 Thread Ted Yu
Gautum: See also HBASE-10642 which went into 0.94.18 You can do rolling upgrade from 94.6 to 94.21 Cheers On Tue, Aug 12, 2014 at 5:42 PM, Gautam wrote: > Thanks for the replies.. > > Matteo, > > We'r running 94.6 since February so, sadly the prod cluster doesn't have > this SKIP_FLUSH opti

Re: Hbase Scan/Snapshot Performance...

2014-08-12 Thread Gautam
Thanks for the replies.. Matteo, We're running 94.6 since February so, sadly the prod cluster doesn't have this SKIP_FLUSH option right now. Would be great if there are options I could use right now until we upgrade to 98. Ted, Thanks for the jira. That is exactly what we intend to use for

Re: Hbase Scan/Snapshot Performance...

2014-08-12 Thread Ted Yu
Gautum: Please take a look at this: HBASE-8369 MapReduce over snapshot files Cheers On Tue, Aug 12, 2014 at 3:11 PM, Matteo Bertozzi wrote: > There is HBASE-10935, included in 0.94.21 where you can specify to skip > the memstore flush and the result will be the online version of an "offline >

Re: Hbase Scan/Snapshot Performance...

2014-08-12 Thread Matteo Bertozzi
There is HBASE-10935, included in 0.94.21 where you can specify to skip the memstore flush and the result will be the online version of an "offline snapshot" snapshot 'sourceTable', 'snapshotName', {SKIP_FLUSH => true} On Tue, Aug 12, 2014 at 10:58 PM, Gautam wrote: > Hello, > > We'v b

Hbase Scan/Snapshot Performance...

2014-08-12 Thread Gautam
Hello, We've been using and loving HBase for a couple of months now. Our primary use case for HBase is writing events in stream to an online time series HBase table. Every so often we run medium to large batch scan MR jobs on sections (1 hour, 1 day, 1 week) of this same time series table. This o

Re: Hbase scan using TIMERANGE

2014-06-28 Thread Ted Yu
TIMERANGE as all the tables have > this field. > > Thanks, > Yogi > > > > -- > View this message in context: > http://apache-hbase.679495.n3.nabble.com/Hbase-scan-using-TIMERANGE-tp4060851.html > Sent from the HBase User mailing list archive at Nabble.com. >

Hbase scan using TIMERANGE

2014-06-28 Thread yogi
.679495.n3.nabble.com/Hbase-scan-using-TIMERANGE-tp4060851.html Sent from the HBase User mailing list archive at Nabble.com.

Limiting number of records in Hbase Scan

2014-05-19 Thread krish_571
Is there any java api to limit the number of scanned records after using start and stop rows? Is pagefilter an option? -- View this message in context: http://apache-hbase.679495.n3.nabble.com/Limiting-number-of-records-in-Hbase-Scan-tp4059401.html Sent from the HBase User mailing list archive

Re: Query HBase : Scan : Fwd: [New comment] HBase shell commands

2014-03-31 Thread Ted Yu
For #1, please take a look at the following method in HTable: public boolean checkAndPut(final byte [] row, final byte [] family, final byte [] qualifier, final byte [] value, final Put put) For #2, can you clarify your goal ? Java API provides stronger capability compared to she

Query HBase : Scan : Fwd: [New comment] HBase shell commands

2014-03-31 Thread Pritesh Prajapati
We have configured HBase 0.94.1 in pseudo-distributed mode on Hadoop 1.0.3 and it's working fine. We have several queries: 1. How can we search for a particular value and compare against it within HBase? i.e. We stored XYZ value in HBase now for next time before storing New Value in HBa

Re: HBase scan performance decreases over time.

2012-11-05 Thread Leonid Fedotov
There is property dfs.balance.bandwidthPerSec in hdfs-site.xml dfs.balance.bandwidthPerSec 625 Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose in term of the number of bytes per second. Thank you

Re: HBase scan performance decreases over time.

2012-11-05 Thread Michael Segel
hdfs-site.xml It's an HDFS setting that may impact the balancing of HBase as well. (I'm sure someone can give a better response by looking at the code.) On Nov 5, 2012, at 12:14 PM, Asaf Mesika wrote: > Where is this setting located? > > Sent from my iPhone > > On 5 Nov 2012, at 15:05, M

Re: HBase scan performance decreases over time.

2012-11-05 Thread Asaf Mesika
Where is this setting located? Sent from my iPhone On 5 Nov 2012, at 15:05, Michael Segel wrote: > There's an HDFS bandwidth setting which is set to 10MB/s. > > Way too low for even 1GbE. > > Have you modified this setting yet? > > -Mike > > On Nov 3, 2012, at 2:50 PM, David Koch wrote: > >>

Re: HBase scan performance decreases over time.

2012-11-05 Thread Michael Segel
There's an HDFS bandwidth setting which is set to 10MB/s. Way too low for even 1GbE. Have you modified this setting yet? -Mike On Nov 3, 2012, at 2:50 PM, David Koch wrote: > Hello Ted, > > We never initiate major compaction manually. I have not looked at I/O > balance between nodes in det

Re: HBase scan performance decreases over time.

2012-11-03 Thread Ted Yu
Have you looked at http://hbase.apache.org/book.html#performance ? Thanks On Sat, Nov 3, 2012 at 12:50 PM, David Koch wrote: > Hello Ted, > > We never initiate major compaction manually. I have not looked at I/O > balance between nodes in detail. We have noticed that after running for a > coupl

Re: HBase scan performance decreases over time.

2012-11-03 Thread David Koch
Hello Ted, We never initiate major compaction manually. I have not looked at I/O balance between nodes in detail. We have noticed that after running for a couple of weeks HBase seems to spend hours pushing blocks between nodes in order to optimize things. We add data daily in one ~30gb push to sev

Re: HBase scan performance decreases over time.

2012-11-03 Thread Ted Yu
Can you tell us how often you run major compaction after the import ? Have you noticed imbalanced read / write requests in the cluster ? Meaning subset of region servers receive bulk of the writes. We do some manual movement of regions when the above happens. Cheers On Sat, Nov 3, 2012 at 8:12 A

HBase scan performance decreases over time.

2012-11-03 Thread David Koch
Hello, Every now and then we need to flatten our cluster and re-import all data from log files (changes in data format, etc.) Afterwards we notice a significant increase in scan performance. As data is added and shuffled around between region servers, performance goes down again over time (say a c

Re: Hbase Scan - number of columns make the query performance way different

2012-09-17 Thread Alex Baranau
> On Thu, Sep 13, 2012 at 7:35 AM, Shengjie Min wrote: > > In my case, I am not feeding hbase result to mapred, it's just pure hbase scan, returning all columns vs two columns makes huge difference to me. > > On 13 September 2012 15:29, Dou

Re: Hbase Scan - number of columns make the query performance way different

2012-09-13 Thread Jacques
my case, I am not feeding hbase result to mapred, it's just pure hbase > scan, returning all columns vs two columns makes huge difference to me. > > On 13 September 2012 15:29, Doug Meil > wrote: > > > > > Hi there, I don't know the specifics of your environ

Re: Hbase Scan - number of columns make the query performance way different

2012-09-13 Thread Shengjie Min
In my case, I am not feeding hbase result to mapred, it's just pure hbase scan, returning all columns vs two columns makes huge difference to me. On 13 September 2012 15:29, Doug Meil wrote: > > Hi there, I don't know the specifics of your environment, but ... > > h

Re: Hbase Scan - number of columns make the query performance way different

2012-09-13 Thread Doug Meil
y the columns you need means you are reducing the data transferred between the RS and the client and the number of KV's evaluated in the RS, etc. On 9/13/12 10:12 AM, "Shengjie Min" wrote: >Hi, > >I found an interesting difference between hbase scan query. > >I ha

Re: HBase- Scan with wildcard character

2011-12-15 Thread Sreeram K
Thanks for the reply. But that is from Java ..I am looking from the HBase shell? - Original Message - From: Stack To: user@hbase.apache.org; Sreeram K Cc: Sent: Thursday, December 15, 2011 10:10 AM Subject: Re: HBase- Scan with wildcard character On Thu, Dec 15, 2011 at 8:59 AM

Re: HBase- Scan with wildcard character

2011-12-15 Thread Stack
On Thu, Dec 15, 2011 at 8:59 AM, Sreeram K wrote: > I have one more question.. > Can we have a query in HBase shell based on Column Value. > > I am looking at scan-> with Column ID? is that possible..the way we are doing > with STARTROW? > Can you pls point me to an example.. > > You need to use a v

Re: HBase- Scan with wildcard character

2011-12-15 Thread Sreeram K
" ; lars hofhansl Cc: Sent: Wednesday, December 14, 2011 6:46 AM Subject: Re: HBase- Scan with wildcard character Thank you Lars. STOPROW did work in my hbase shell as you suggested - Original Message - From: lars hofhansl To: "user@hbase.apache.org" ; Sreeram K Cc:

Re: HBase- Scan with wildcard character

2011-12-14 Thread Sreeram K
Thank you Lars. STOPROW did work in my hbase shell as you suggested - Original Message - From: lars hofhansl To: "user@hbase.apache.org" ; Sreeram K Cc: Sent: Tuesday, December 13, 2011 3:56 PM Subject: Re: HBase- Scan with wildcard character The shell lets you only do

Re: HBase- Scan with wildcard character

2011-12-13 Thread lars hofhansl
r row keys. -- Lars From: Sreeram K To: "user@hbase.apache.org" ; lars hofhansl Sent: Tuesday, December 13, 2011 2:16 PM Subject: Re: HBase- Scan with wildcard character Thanks Doug. I am looking more from HBase shell for this. - Original Message --

Re: HBase- Scan with wildcard character

2011-12-13 Thread Sreeram K
Thanks Doug. I am looking more from HBase shell for this. - Original Message - From: Doug Meil To: "user@hbase.apache.org" ; Sreeram K ; lars hofhansl Cc: Sent: Tuesday, December 13, 2011 2:01 PM Subject: Re: HBase- Scan with wildcard character Hi there- At some po

Re: HBase- Scan with wildcard character

2011-12-13 Thread Doug Meil
>- Original Message - >From: lars hofhansl >To: "user@hbase.apache.org" ; Sreeram K > >Cc: >Sent: Tuesday, December 13, 2011 11:36 AM >Subject: Re: HBase- Scan with wildcard character > >info:regioninfo is actually a serialized Java object (HRegionInfo

Re: HBase- Scan with wildcard character

2011-12-13 Thread Sreeram K
ase shell - Original Message - From: lars hofhansl To: "user@hbase.apache.org" ; Sreeram K Cc: Sent: Tuesday, December 13, 2011 11:36 AM Subject: Re: HBase- Scan with wildcard character info:regioninfo is actually a serialized Java object (HRegionInfo). What you see in the

Re: HBase- Scan with wildcard character

2011-12-13 Thread lars hofhansl
"user@hbase.apache.org" ; lars hofhansl Sent: Tuesday, December 13, 2011 12:16 AM Subject: Re: HBase- Scan with wildcard character Thanks Lars, I will look into that. One more question on the hbase shell. If I have: hbase> scan 't1.', {COLUMNS => 'info:regionin

Re: HBase- Scan with wildcard character

2011-12-13 Thread Sreeram K
Thanks Lars, I will look into that. One more question on the hbase shell. If I have: hbase> scan 't1.', {COLUMNS => 'info:regioninfo'}, it prints all the columns of regioninfo. Can I have a condition like: if column info:regioninfo=2 (value) then

Re: HBase- Scan with wildcard character

2011-12-12 Thread lars hofhansl
rt after any valid hex digit), and the scanner will automatically stop at the last possible match. Have a look at the the Scan object and HTable.getScanner(...) -- Lars - Original Message - From: Sreeram K To: "user@hbase.apache.org" Cc: Sent: Monday, December 12, 2011 6
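The start-row/stop-row approach described above can be sketched as follows. The snippet is truncated, so this is not necessarily the exact stop row that was suggested. It shows one common way to derive an exclusive stop row from a prefix (strip trailing 0xFF bytes, then increment the last remaining byte), assuming row keys compare as unsigned bytes. The class name and the example prefix are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: computes an exclusive stop row for a row-key prefix scan.
final class PrefixRange {
    // Returns the smallest key that sorts after every key starting with the prefix.
    // An empty result means "no upper bound" (the prefix was empty or all 0xFF).
    static byte[] stopRowFor(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowFor("4E11".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(stop, StandardCharsets.US_ASCII));
    }
}
```

With a start row equal to the prefix and a stop row computed this way, the scan covers exactly the keys that begin with the prefix. In the shell the same pair of values would go into the STARTROW and STOPROW options of scan.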

HBase- Scan with wildcard character

2011-12-12 Thread Sreeram K
Hi, I have a table defined with 3 columns. I am looking for a query in the HBase shell to print all the values whose row key starts with some characters. Example: My row ids are: Column+Key 4E11676773AC3B6E9A3FE1CCD1051B8C&1323736118749497 column=xx:size, timestamp=67667767, value= 4E1167677

Re: hbase scan job property

2011-11-29 Thread Jean-Daniel Cryans
would basically be like re-implementing a language like Hive. J-D On Tue, Nov 29, 2011 at 3:57 AM, tousif wrote: > *Hi, > > How can i specify job property for hbase scan prefix filter.  for example > to scan columns i provide > hbase.mapreduce.scancolumn_family. > > > * > -- > Regards > Tousif >

hbase scan job property

2011-11-29 Thread tousif
Hi, How can I specify a job property for an HBase scan prefix filter? For example, to scan columns I provide hbase.mapreduce.scancolumn_family. -- Regards Tousif

Re: HBase Scan returns fewer columns after a few minutes of insertion

2011-08-30 Thread Jean-Daniel Cryans
If you want to limit the number of rows you can instead set the caching to exactly what you need, or set a stop row. J-D On Mon, Aug 29, 2011 at 11:38 PM, Neerja Bhatnagar wrote: > Hi J-D, > > Thank you very much! Hopefully, this iteration clears it up for me. > The batchSize is set to 1. I trie

Re: HBase Scan returns fewer columns after a few minutes of insertion

2011-08-29 Thread Jean-Daniel Cryans
(Sending to user@ again and bccing dev@ for the last time, please take notice and reply to user@) Ok so it should be something about the code... what is batchSize set to? I don't see it in that code snippet. getMap gives a map of all the families with all the data, whereas getFamily gives a map o

Re: HBase Scan returns fewer columns after a few minutes of insertion

2011-08-29 Thread Jean-Daniel Cryans
Hi, > > I am sorry if this question has been resolved before. Thank you for your > help. > > I am seeing really strange behavior with HBase Scan. > > I insert 1 row into a table named test, 1 col family named testColFam, and 3 > columns : foo (with value foo), bar (with va

Re: HBase Scan

2011-01-04 Thread Ryan Rawson
Forward order only. On Jan 4, 2011 6:17 PM, "King JKing" wrote: > Dear all, > > Does HBase support scanning in both reverse order and normal order? > > Thanks a lot for your support.

HBase Scan

2011-01-04 Thread King JKing
Dear all, Does HBase support scanning in both reverse order and normal order? Thanks a lot for your support.

Re: Hbase bulk insert vs Hbase scan

2010-09-13 Thread Ted Yu
or a simple function on Hbase? > > Another question: What is the difference in using ImmutableBytesWritable > or > not using that for the key? > > Thanks in advance! > -- > View this message in context: > http://old.nabble.com/Hbase-bulk-insert-vs-Hbase-scan-tp296

Hbase bulk insert vs Hbase scan

2010-09-13 Thread fitman_82
View this message in context: http://old.nabble.com/Hbase-bulk-insert-vs-Hbase-scan-tp29696401p29696401.html Sent from the HBase User mailing list archive at Nabble.com.