Scan a region in parallel

2016-10-20 Thread Anil
HI, I am loading hbase table into an in-memory db to support filter, ordering and pagination. I am scanning region and inserting data into in-memory db. each region scan is done in single thread so each region is scanned in parallel. Is there any way to scan a region in parallel ? any pointers w

Re: Scan a region in parallel

2016-10-20 Thread Anil
Any pointers ? On 20 October 2016 at 18:15, Anil wrote: > HI, > > I am loading hbase table into an in-memory db to support filter, ordering > and pagination. > > I am scanning region and inserting data into in-memory db. each region > scan is done in single thread so each

Re: Scan a region in parallel

2016-10-21 Thread Anil
se/phoenix table into in-memory data base for faster access. scanning of big region sequentially will lead to larger load time. so finding ways to minimize the load time. Hope this helps. Thanks. On 21 October 2016 at 09:30, ramkrishna vasudevan < ramkrishna.s.vasude...@gmail.com> w

Re: Scan a region in parallel

2016-10-21 Thread Anil
arallelism happens by using guideposts - those are fixed spaced > row keys stored in a separate stats table. So when you do a query the > Phoenix internally spawns parallel scan queries using those guide posts > and thus making querying faster. > > Regards > Ram > > On
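
The guidepost idea described in this snippet can be sketched roughly as follows. This is only an illustration and not Phoenix's actual implementation: the names `split_by_guideposts`, `scan_range` and `parallel_scan` are hypothetical, and a sorted in-memory list of keys stands in for a region so the example runs without a cluster.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


def split_by_guideposts(start, stop, guideposts):
    """Cut the half-open key range [start, stop) at each guidepost inside it."""
    posts = sorted(guideposts)
    lo = bisect_right(posts, start)  # first guidepost strictly after start
    hi = bisect_left(posts, stop)    # first guidepost at or after stop
    bounds = [start] + posts[lo:hi] + [stop]
    return list(zip(bounds[:-1], bounds[1:]))


def scan_range(rows, start, stop):
    """Stand-in for a real range scan; `rows` is a sorted list of row keys."""
    return rows[bisect_left(rows, start):bisect_left(rows, stop)]


def parallel_scan(rows, start, stop, guideposts, workers=4):
    """Scan each guidepost-delimited chunk in its own thread, keeping key order."""
    chunks = split_by_guideposts(start, stop, guideposts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda chunk: scan_range(rows, *chunk), chunks)
        return [key for part in parts for key in part]
```

In a real client each chunk would presumably become its own scan with the chunk's start and stop rows; how evenly the work is spread depends on how evenly the guideposts divide the data.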

Parallel Scanner

2017-02-18 Thread Anil
Hi , I am building an usecase where i have to load the hbase data into In-memory database (IMDB). I am scanning the each region and loading data into IMDB. i am looking at parallel scanner ( https://issues.apache.org/jira/browse/HBASE-8504 ) and HTable# getRegionsInRange(byte[] startKey, byte[] e

Parallel Scanner

2017-02-18 Thread Anil
Hi , I am building an usecase where i have to load the hbase data into In-memory database (IMDB). I am scanning the each region and loading data into IMDB. i am looking at parallel scanner ( https://issues.apache.org/ jira/browse/HBASE-8504, HBASE-1935 ) to reduce the load time and HTable# getReg

Re: Parallel Scanner

2017-02-19 Thread Anil
is better you try to get the regions every time before you issue a scan" - Agree. i am dynamically determining the region start key and end key before initiating scan operations for every initial load. Thanks. On 20 February 2017 at 10:59, ramkrishna vasudevan < ramkrishna.s.vasude...@

Re: Parallel Scanner

2017-02-19 Thread Anil
hbase/entry/coprocessor_introduction. > > Be careful when you use them. Since these endpoints run on server ensure > that these are not heavy or things that consume more memory which can have > adverse effects on the server. > > > Regards > Ram > > On Mon, Feb 20, 2017 at

Re: Parallel Scanner

2017-02-20 Thread Anil
parallel scan. Note you can also group the scans by region server. > > Cheers, > Richard > On 20 Feb 2017, at 07:33, Anil mailto:ani > lk...@gmail.com>> wrote: > > Thanks Ram. I will look into EndPoints. > > On 20 February 2017 at 12:29, ramkrishna vasudevan <

Re: Parallel Scanner

2017-02-20 Thread Anil
your regions which you have some control over. You can achieve > the same effect by pre-splitting your table such that you empirically > optimise read performance for the dataset you store. > > > Thanks, > > Richard > > > > From: An

Re: Parallel Scanner

2017-02-20 Thread Anil
wrote: > You are trying to scan one region itself in parallel, then even I got you > wrong. Richard's suggestion is the right choice for client only soln. > > On Mon, Feb 20, 2017 at 7:40 PM, Anil wrote: > > > Thanks Richard :) > > > > On 20 February 20

Re: Parallel Scanner

2017-02-20 Thread Anil
wrote: > Anil: > What's the current region size you use ? > > Given a region, do you have some idea how the data is distributed within > the region ? > > Cheers > > On Mon, Feb 20, 2017 at 7:14 AM, Anil wrote: > > > i understand my original post now :) So

Re: Parallel Scanner

2017-02-20 Thread Anil
; If you cannot change the schema, do you have control over the region size ? > Smaller region may lower the variance in data distribution per region. > > On Mon, Feb 20, 2017 at 7:47 AM, Anil wrote: > > > Hi Ted, > > > > Current region size is 10 GB. > > > >

Re: Parallel Scanner

2017-02-20 Thread Anil
Hi Ted, Thanks. I will go through phoenix code. Thanks. On 20 February 2017 at 21:50, Ted Yu wrote: > Please read https://phoenix.apache.org/update_statistics.html > > FYI > > On Mon, Feb 20, 2017 at 8:14 AM, Anil wrote: > > > Hi Ted, > > > > it

Re: Using doubles and longs as ordering row values

2012-11-05 Thread anil gupta
-java-rounding-double-issue You might get wrong results due problems in Double. You can use two long or int to store the decimal value as RowKey. HTH, Anil Gupta On Mon, Nov 5, 2012 at 11:38 AM, Jean-Daniel Cryans wrote: > On Mon, Nov 5, 2012 at 10:41 AM, Jonathan Bishop > wrote:
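
One way to read the "two long or int" suggestion is to store the whole part and a scaled fractional part as two fixed-width big-endian integers, so that byte order in the row key matches numeric order. The sketch below is an assumption about how that could look, not code from the thread: `SCALE`, `encode_decimal` and `decode_decimal` are hypothetical names, only non-negative values are handled, and unsigned integers are used so that plain byte comparison sorts correctly.

```python
import struct
from decimal import Decimal

SCALE = 10**6  # assumed: six fractional digits are kept in the key


def encode_decimal(value):
    """Encode a non-negative decimal as 16 bytes: whole part, then fraction."""
    value = Decimal(value)
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    whole = int(value)                      # truncates toward zero
    frac = int((value - whole) * SCALE)     # fractional digits scaled to an integer
    return struct.pack(">QQ", whole, frac)  # big-endian, unsigned, no padding


def decode_decimal(key):
    """Inverse of encode_decimal."""
    whole, frac = struct.unpack(">QQ", key)
    return Decimal(whole) + Decimal(frac) / SCALE
```

Negative values would need extra handling (for example flipping the sign bit of a signed encoding) before byte order and numeric order agree; that is left out here.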

Re: Hive and Hbase performance

2012-11-17 Thread anil gupta
Hi Dalia, The usual and short answer is Yes. Both, HBase and Hive will provide better performance on adding more nodes since they provide horizontal scalability. HTH, Anil On Sat, Nov 17, 2012 at 4:02 PM, Dalia Sobhy wrote: > > I want to ask if a Hive Count or Scan Query will provide

Re: Problem deleting neighboors with timestamp=0

2012-11-18 Thread anil gupta
value 0 as timestamp? HBase is used by a variety of users in quite different use cases. So, i dont think it would be a good idea of introducing this restriction. HTH, Anil On Sun, Nov 18, 2012 at 9:45 PM, Chris Larsen wrote: > > So you mean that you have explicitly set the timestamp to 0

Re: Problem with xml data in hbase bulk loading

2012-11-20 Thread anil gupta
at ImportTsv class in HBase. HTH, Anil Gupta On Tue, Nov 20, 2012 at 3:59 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi > > In csv files, new line = new entry :( > > So I think your only option is to fix your input file by removing your > extra lines. >

Re: hbase error

2012-11-21 Thread anil gupta
ed in logs. HTH, Anil On Wed, Nov 21, 2012 at 8:32 AM, wanli gao wrote: > hi all, > i m now running hbase on pseudo-distributed mode, and when i open hbase > webUI: http://localhost:60010/master.jsp,its give an error: > Trying to contact region server null for region , row ''

Re: Custom versioning best practices

2012-11-22 Thread anil gupta
Hi David, As per my knowledge, HBase currently doesn't supports specifying separate setMaxVersion for different column family in a single Scan object. HTH, Anil On Thu, Nov 22, 2012 at 12:47 PM, David Koch wrote: > Hello Michael, > > Thank you for your response. > &g

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-25 Thread anil gupta
Hi Rams, The description of your use case is very abstract so i will try to answer your question to the best of my ability. 1) Whether is it good to create a single table for all the 600+ columns? Anil: Yes, it is absolutely ok to have 600+ columns in a row in HBase (you can go max upto few

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-25 Thread anil gupta
More on number of column families: http://hbase.apache.org/book/number.of.cfs.html On Sun, Nov 25, 2012 at 11:35 PM, anil gupta wrote: > Hi Rams, > > The description of your use case is very abstract so i will try to answer > your question to the best of my ability. > > >

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-27 Thread anil gupta
e detailed requirements, constraints and carrying out couple of experiments. HTH, Anil On Tue, Nov 27, 2012 at 8:44 PM, Ramasubramanian < ramasubramanian.naraya...@gmail.com> wrote: > Hi, > > Thanks!! > > Can someone help in suggesting what is the best rowkey that we can use

Re: Best practice for storage of data that changes

2012-11-30 Thread anil gupta
the HBase mailing list also since this is more about HBase. Hope This Helps, Anil Gupta On Thu, Nov 29, 2012 at 8:51 PM, Lance Norskog wrote: > Please! There are lots of blogs etc. about the two, but very few > head-to-head for a real use case. > > -- &

Re: Why InternalScanner doesn't have a method that returns entire row or object of Result

2012-11-30 Thread anil gupta
l/List.html?is-external=true> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/KeyValue.html> > results) Grab the next row's worth of values. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html Thanks, Anil Gupta On Fri, Nov 30,

Re: Why InternalScanner doesn't have a method that returns entire row or object of Result

2012-11-30 Thread anil gupta
Hi Ted, I figured out that i have to use next from InternalScanner. Thanks for the response. The comment for method "Grab the next row's worth of values." was a little confusing to me. "Get the keyValue's for the next row" would have been better. Just saying

HBase Integration with Active Directory

2012-12-03 Thread anil gupta
high level steps. Is it something similar Kerberos integration with Active Directory in Hadoop? -- Thanks & Regards, Anil Gupta

Re: PrefixFilter : not working for 'long' keys

2012-12-04 Thread anil gupta
. I have played with a rowkey which is a sequence of long and short. In my case it works as long as i am matching the entire variable(long,short,byte) in Prefix Filter. HTH, Anil Gupta On Tue, Dec 4, 2012 at 6:36 AM, Mohammad Tariq wrote: > Thank you for the quick response sir. I was think
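
The point about matching the entire variable can be illustrated with plain byte strings. The sketch below assumes a row key made of an 8-byte big-endian long followed by a 2-byte short; `composite_key` and `prefix_scan` are hypothetical helpers, and `prefix_scan` only imitates what a prefix match does on raw key bytes.

```python
import struct


def composite_key(account, seq):
    """Row key: 8-byte big-endian long followed by a 2-byte big-endian short."""
    return struct.pack(">qh", account, seq)


def prefix_scan(keys, prefix):
    """Imitate a prefix match: keep keys whose raw bytes start with `prefix`."""
    return [k for k in keys if k.startswith(prefix)]


keys = [composite_key(a, s) for a in (1, 12, 123) for s in (0, 1)]

# Full 8-byte encoding of the long: matches only account 12.
full = prefix_scan(keys, struct.pack(">q", 12))

# Text digits are different bytes from the binary encoding, so nothing matches.
text = prefix_scan(keys, b"12")

# A partial slice of the encoded long is mostly zero bytes and matches every key.
partial = prefix_scan(keys, struct.pack(">q", 12)[:7])
```

Under these assumptions only the prefix built from the complete encoded long selects the intended rows.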

Re: PrefixFilter : not working for 'long' keys

2012-12-04 Thread Anil Gupta
Hi Mohammad, Let me know the outcome of your experiments. Best Regards, Anil On Dec 4, 2012, at 11:21 AM, Mohammad Tariq wrote: > Hello Anil, > > I see. Your logic sounds appealing. I need to take a few test cases > and test it properly. Thank you for the valuable respone.

Re: HBase Integration with Active Directory

2012-12-08 Thread anil gupta
Hi Harsh, Both of the approach you mentioned would be ok for us. We are aware that Hadoop can be integrated with Active Directory. But, i could not find any such reference for HBase. Do you have any idea about this? Any link or documentation on this would be really helpful. Thanks, Anil Gupta

Re: HBase Integration with Active Directory

2012-12-09 Thread anil gupta
documentation for it. Thanks, Anil Gupta On Sat, Dec 8, 2012 at 11:17 AM, Harsh J wrote: > Hi, > > An KDC can be made to trust an AD, which would solve your need. This > > https://ccp.cloudera.com/display/CDH4DOC/Integrating+Hadoop+Security+with+Active+Directory > is one guide t

Re: HBase Integration with Active Directory

2012-12-09 Thread anil gupta
api before a query is run on the DB and this Oracle instance is integrated to AD). I hope this clarifies my requirement. Thanks, Anil Gupta On Sun, Dec 9, 2012 at 12:58 PM, Harsh J wrote: > Hi, > > Correct me if I'm wrong, but HBase presently has no reliance on the > concept

Re: Checking if a coprocessor was loaded successfully from client

2012-12-10 Thread anil gupta
might be helpful for you. I have used it in past to check whether a coprocessors is successfully added or not. HTH, Anil On Sun, Dec 9, 2012 at 2:25 PM, Ted Yu wrote: > On region server web UI, you should see the list of coprocessors loaded. > > But I guess you're looking for a

Re: HBase Integration with Active Directory

2012-12-10 Thread anil gupta
n that syncs the LDAP ACLs state to the HBase > system table state periodically). > > On Mon, Dec 10, 2012 at 3:17 AM, anil gupta wrote: > > Hi Harsh, > > > > HBase has a concept of ACL. But, these ACL's are maintained as another > > system table "*_acl_*

Re: Bulk Loading from Oracle to Hbase

2012-12-13 Thread anil gupta
Sqoop for loading data into HBase are limited. HTH, Anil Gupta On Thu, Dec 13, 2012 at 8:48 AM, Amandeep Khurana wrote: > Mehmet > > What's the problem you are getting while running the Sqoop job? Can > you give details? > > -Amandeep > > On Thu, Dec 13, 2012

Re: HBase - Secondary Index

2012-12-14 Thread anil gupta
something else? 4. Your region split looks interesting. I dont have much info about it. Can you point to some docs on IndexHalfStoreFileReader? Thanks, Anil Gupta On Tue, Dec 4, 2012 at 12:10 AM, Anoop Sam John wrote: > Hi All > > Last week I got a chance to present th

Re: HBase - Secondary Index

2012-12-14 Thread anil gupta
On Fri, Dec 14, 2012 at 12:54 AM, Anoop Sam John wrote: > Hi Anil, > > >1. In your presentation you mentioned that region of Primary Table and > Region of Secondary Table are always located on the same region server. How > do you achieve it? By using the Primary table rowkey as

Roll of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
ories? Any reference document on it would be highly appreciated? -- Thanks & Regards, Anil Gupta

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
just dedicate 1 disk for hadoop.tmp.dir or 1 disk is also a overkill for hbase.tmp.dir. Thanks, Anil Gupta On Mon, Dec 17, 2012 at 3:46 PM, Stack wrote: > In refguide we repeat content of hbase-default.xml: > http://hbase.apache.org/book.html#hbase.tmp.dir > > What Nick said plus its used

Re: Role of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
o scale over > >> multiple disks and leaves the dirty work to it. > >> > >> Makes sense to be parallelized for "beefier" standalone instances, but > >> I wonder who uses those and how it may even be done as HBase > >> expects/uses a flat direct

Re: Role of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
FYI, I corrected the typo of "Roll" to "Role". Sorry. On Mon, Dec 17, 2012 at 11:18 PM, anil gupta wrote: > Hi All, > > Thanks a lot for your helpful inputs. > > On Mon, Dec 17, 2012 at 8:04 PM, Harsh J wrote: > >> You're correct - I spoke

Re: Role of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
input. PS: corrected the typo of "Roll" to "Role" On Mon, Dec 17, 2012 at 5:46 PM, Nick Dimiduk wrote: > On Mon, Dec 17, 2012 at 5:20 PM, anil gupta wrote: > > > @Nick: I am using HBase 0.92.1, CompactionTool.java is part of HBase 0.96 > > as per https://is

Re: HBase - Secondary Index

2012-12-18 Thread anil gupta
Hi Anoop, Please find my reply inline. Thanks, Anil Gupta On Sun, Dec 16, 2012 at 8:02 PM, Anoop Sam John wrote: > Hi Anil > During the scan, there is no need to fetch any index data > to client side. So there is no need to create any scanner on the index > table a

Re: HBase - Secondary Index

2012-12-19 Thread anil gupta
nk i need to do some more testing. Thanks, Anil Gupta On Tue, Dec 18, 2012 at 1:27 AM, Anoop Sam John wrote: > Anil: > If the scan from client side does not specify any rowkey range but > only the filter condition, yes it will go to all the primary table regions > for the scan. T

Re: HBase - Secondary Index

2012-12-19 Thread anil gupta
Hi Michael, Please find my replies inline. Thanks, Anil On Tue, Dec 18, 2012 at 1:02 AM, Michel Segel wrote: > Just a couple of questions... > > First, since you don't have any natural secondary indices, you can create > one from a couple of choices. Keeping it simple, you c

Re: CF still contains data after deletion

2012-12-20 Thread anil gupta
Hi Roger, I think you are hitting: https://issues.apache.org/jira/browse/HBASE-6564 The above jira was fixed in HBase0.94.1 and later releases. CDH4.1.2 has HBase0.92.1 so it doesn't contains that fix. HTH, Anil Gupta On Thu, Dec 20, 2012 at 12:43 AM, Roger Miller wrote: > Hello, &g

Re: HBase - Secondary Index

2013-01-06 Thread anil gupta
@Mohit: Here is the jira for prefix compression discussed here: https://issues.apache.org/jira/browse/HBASE-4676 HTH, Anil Gupta On Sun, Jan 6, 2013 at 12:40 PM, Adrien Mogenet wrote: > Are your talking about Data block encoding of K/V ? > https://issues.apache.org/jira/browse/HBAS

Re: HBase - Secondary Index

2013-01-08 Thread anil gupta
, there is no outright winner. ~Anil Gupta On Tue, Jan 8, 2013 at 4:30 PM, lars hofhansl wrote: > Different use cases. > > > For global point queries you want exactly what you said below. > For range scans across many rows you want Anoop's design. As usually it > depends

Re: persistence in Hbase

2013-01-10 Thread anil gupta
Hi Mohammad, If the Write Ahead Log(WAL) is "turned on" then in **NO** case data should be lost. HBase is strongly-consistent. If you know of any case when WAL is turned on and data is lost then IMO that's a Critical bug in HBase. Thanks, Anil Gupta On Thu, Jan 10, 2013 at

Re: persistence in Hbase

2013-01-10 Thread anil gupta
to have this feature in HBase. Thanks, Anil Gupta On Thu, Jan 10, 2013 at 7:54 PM, lars hofhansl wrote: > Not entirely true, though. > Data is not sync'ed to disk, but only distributed to all HDFS replicas. > During a power outage event across all HDFS failure zones (such as a data

Re: persistence in Hbase

2013-01-10 Thread anil gupta
Never mind i got the Jira: https://issues.apache.org/jira/browse/HBASE-5954 On Thu, Jan 10, 2013 at 8:16 PM, anil gupta wrote: > Hi Lars, > > Yes, that is true. I also came to know about it few days ago that data is > present in Memory(rather than persistent storage) of 3 Dat

Re: Maximizing throughput

2013-01-10 Thread anil gupta
y I/O > waits or anything in particular than raises concerns. I am using iostat and > iftop to test throughput. To determine theoretical max, I used dd and > iperf. I have spent quite a bit of time optimizing the HBase config > parameters, optimizing GC, etc., and am familiar with the HBase book online > and such. > -- Thanks & Regards, Anil Gupta

Re: Maximizing throughput

2013-01-10 Thread anil gupta
Sorry, I meant to ask about "setAutoFlush". Is setAutoFlush true or false? On Thu, Jan 10, 2013 at 8:42 PM, anil gupta wrote: > Is flushCommits true or false? > > > On Thu, Jan 10, 2013 at 8:40 PM, Anoop Sam John wrote: > >> Hi >>You mind telling

Re: Coprocessor / threading model

2013-01-12 Thread anil gupta
issues ? How may I solve > my > > issue ? > > > > Help is welcome :-) > > > > -- > > Adrien Mogenet > > 06.59.16.64.22 > > http://www.mogenet.me > -- Thanks & Regards, Anil Gupta

Re: Just joined the user group and have a question

2013-01-17 Thread anil gupta
ve to sacrifice the performance of any one of them. Both of them cannot be optimized. HTH, Anil On Thu, Jan 17, 2013 at 9:34 AM, Doug Meil wrote: > Hi there- > > If you're absolutely new to Hbase, you might want to check out the Hbase > refGuide in the architecture, performanc

Re: RegionSplitter command

2013-01-17 Thread anil gupta
Hi Jean, Yes, i am stuck with 0.92.1. Thanks for your response. I think, i will need to dig more deep into this. ~Anil On Thu, Jan 17, 2013 at 3:02 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi Anil, > > bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -

Possibly unnecessary check in Result.getColumnLatest(byte[] family, byte[] qualifier)

2013-01-22 Thread anil gupta
, qualifier)) { return kv; } Am i missing something over here? -- Thanks & Regards, Anil Gupta

Re: Possibly unnecessary check in Result.getColumnLatest(byte[] family, byte[] qualifier)

2013-01-22 Thread anil gupta
Hi Ted, Maybe it is out of sync with trunk. Here is the svn link for Result.java in HBase0.94.3: http://svn.apache.org/repos/asf/hbase/tags/0.94.3/src/main/java/org/apache/hadoop/hbase/client/Result.java ~Anil On Tue, Jan 22, 2013 at 10:18 AM, Ted Yu wrote: > I am looking at trunk c

Re: Possibly unnecessary check in Result.getColumnLatest(byte[] family, byte[] qualifier)

2013-01-22 Thread anil gupta
sult.java I mean to say that we should not check for "matchingColumn" since we are already doing the BinarySearch and getting the position of keyValue. I hope i am making some sense. > > On Tue, Jan 22, 2013 at 2:49 PM, anil gupta wrote: > > > Hi Ted, > > > > May

Re: write throughput in cassandra, understanding hbase

2013-01-22 Thread anil gupta
cluster or 2-3k write ops per RS in the cluster? ~Anil On Tue, Jan 22, 2013 at 12:57 PM, Asaf Mesika wrote: > Sent from my iPhone > > On 22 בינו 2013, at 20:47, Jean-Daniel Cryans wrote: > > On Tue, Jan 22, 2013 at 10:38 AM, S Ahmed wrote: > > I've read articles on

Re: Pagination with HBase - getting previous page of data

2013-01-25 Thread anil gupta
page. > 110 120 130 140 150 160 170 180 190 200 is the second one. > > Now, if someone insert 101... If will be just after 100 and before 110. > Anil: Instead of scanning from 010 to 100, scan from 010 to 110. Then we wont have this problem. So, i mean to say that startRow(firstRowKeyof
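
The suggestion in this snippet, scanning from the first row key of a page up to the first row key of the next page, can be sketched as below. This is a simplified model under stated assumptions: a sorted Python list stands in for the table, and `scan`, `page_starts` and `fetch_page` are hypothetical names rather than anything from the thread.

```python
from bisect import bisect_left, insort


def scan(rows, start, stop=None):
    """Stand-in for a range scan over sorted row keys: start inclusive, stop exclusive."""
    lo = bisect_left(rows, start)
    hi = len(rows) if stop is None else bisect_left(rows, stop)
    return rows[lo:hi]


def page_starts(rows, page_size):
    """Remember the first row key of every page at the time pagination begins."""
    return rows[::page_size]


def fetch_page(rows, starts, page):
    """Scan from this page's first key up to the next page's first key."""
    stop = starts[page + 1] if page + 1 < len(starts) else None
    return scan(rows, starts[page], stop)


rows = ["%03d" % n for n in range(10, 201, 10)]  # 010, 020, ..., 200
starts = page_starts(rows, 10)                   # first keys of page 0 and page 1
insort(rows, "101")                              # a row inserted after paging began
first_page = fetch_page(rows, starts, 0)
second_page = fetch_page(rows, starts, 1)
```

With these boundaries the newly inserted row shows up at the end of the first page instead of being skipped, at the cost of that page holding more rows than the nominal page size.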

Re: Pagination with HBase - getting previous page of data

2013-01-25 Thread anil gupta
Inline... On Fri, Jan 25, 2013 at 9:17 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi Anil, > > The issue is that all the other sub-sequent page start should be moved > too... > Yes, this is a possibility. Hence the Developer has to take care of this c

Re: Pagination with HBase - getting previous page of data

2013-01-27 Thread anil gupta
That's alright..I thought that you have come-up with a killer solution. So, got curious to hear your ideas. ;) It seems like your below mentioned solution will not work on filtering on non row-key columns since when you are deciding the page numbers you are only considering rowkey. Thanks,

Re: Pagination with HBase - getting previous page of data

2013-01-29 Thread anil gupta
here ;) > > But I'm still thinking about that because I might have to implement > some pagination options soon... > > As you are saying, it's only working on the row-key, but if you want > to do the same-thing on non-rowkey, you might have to create a > secondary index

Re: Pagination with HBase - getting previous page of data

2013-01-29 Thread anil gupta
Hi Jean, Please find my reply inline. On Tue, Jan 29, 2013 at 1:40 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi Anil, > > I think it really depend on the way you want to use the pagination. > Absolutely true! > > Do you need to be able to jump to p

Re: Pagination with HBase - getting previous page of data

2013-01-30 Thread anil gupta
rs in parallel. Why it cannot guarantee results <= page size( my guess: due to multiple RS scans)? If you have used it then maybe you can explain the behaviour? Thanks, Anil On Tue, Jan 29, 2013 at 7:32 PM, Mohammad Tariq wrote: > I'm kinda hesitant to put my leg in between the pros ;)Bu

Problem in reading Map Output file via RecordReader

2013-01-30 Thread anil gupta
some problem. Can someone tell me the correct way of deserializing the output file of mapper? or There is some problem with my code? Here is the link to my initial stab at RecordReader: https://dl.dropbox.com/u/64149128/ImmutableBytesWritable_Put_RecordReader.java -- Thanks & Regards, Anil Gupta

Re: Pagination with HBase - getting previous page of data

2013-02-02 Thread anil gupta
Hi Anoop, Please find my reply inline. Thanks, Anil On Wed, Jan 30, 2013 at 3:31 AM, Anoop Sam John wrote: > @Anil > > >I could not understand that why it goes to multiple regionservers in > parallel. Why it cannot guarantee results <= page size( my guess: due to > multi

Re: Hbase Read/Write throughput measure

2013-02-03 Thread anil gupta
go. HTH, Anil Gupta On Sun, Feb 3, 2013 at 8:48 AM, Dalia Sobhy wrote: > > Hello Mohamed, > > I believe thats not the case. > > check this blog > http://stackoverflow.com/questions/11649824/hbase-error-not-a-hostport-pair > > It denotes that this error is due to in

Re: Pagination with HBase - getting previous page of data

2013-02-03 Thread anil gupta
On Sun, Feb 3, 2013 at 8:07 AM, Anoop John wrote: > >lets say for a scan setCaching is > 10 and scan is done across two regions. 9 Results(satisfying the filter) > are in Region1 and 10 Results(satisfying the filter) are in Region2. Then > will this scan return 19 (9+10) results?

Re: Pagination with HBase - getting previous page of data

2013-02-03 Thread anil gupta
Inline... On Sun, Feb 3, 2013 at 9:25 AM, Toby Lazar wrote: > Quick question - if you perform the pagination client-side and just > call scanner.iterator().next() > to get to the necessary results, doesn't this add unecessary network > traffic of the unused results? Anil:

Re: Hbase Read/Write throughput measure

2013-02-03 Thread anil gupta
Hi Dalia, Recently, i ran into same problem with YCSB. You will need to recompile YCSB with HBase0.92. Download the maven project from YCSB site and recompile it with HBase0.92. HTH, Anil Gupta On Sun, Feb 3, 2013 at 9:49 AM, Dalia Sobhy wrote: > > Hello Mohamed, > > Yes Hbase is

Re: does hbase master need to be a hadoop datanode as well?

2013-02-22 Thread anil gupta
On Feb 22, 2013 11:50 AM, "Jean-Marc Spaggiari" wrote: > > Just to add to Mohammad's advices, you should avoid to run ZK on the same > servers as you are running HBase. > > Reason is, if you are running in long GCs, ZK might miss the heartbeats and > thinks servers are down. So safer to run same s

Re: Welcome our newest Committer Anoop

2013-03-10 Thread Anil Gupta
Congrats & Welcome Anoop!! Best Regards, Anil On Mar 10, 2013, at 9:58 AM, Ted Yu wrote: > Congratulations, Anoop. > > Keep up the good work. > > On Sun, Mar 10, 2013 at 9:42 AM, ramkrishna vasudevan < > ramkrishna.s.vasude...@gmail.com> wrote: > >>

Re: HBase and Datawarehouse

2013-04-28 Thread anil gupta
just store denormalized data and do simple queries then HBase is good. For OLAP kind of stuff, you can make HBase work but IMO you will be better off using Hive for data warehousing. HTH, Anil Gupta On Sun, Apr 28, 2013 at 8:39 PM, Kiran wrote: > But in HBase data can be said to be

Re: HBase and Datawarehouse

2013-04-29 Thread anil gupta
Inline. On Sun, Apr 28, 2013 at 10:40 PM, Kiran wrote: > Anil, > > So it means HBase can help in easy retrieval and insertions on large > volumes > of data but it lacks the power to analyse and summarize the data? Out of the box, it can do simple aggregations like sum, avg, e

Re: [ANNOUNCE] Phoenix 1.2 is now available

2013-05-16 Thread anil gupta
Hi James, You have mentioned support for TopN query. Can you provide me HBase Jira ticket for that. I am also doing similar stuff in https://issues.apache.org/jira/browse/HBASE-7474. I am interested in knowing the details about that implementation. Thanks, Anil Gupta On Thu, May 16, 2013 at 12

Re: [ANNOUNCE] Phoenix 1.2 is now available

2013-05-16 Thread anil gupta
Hi James, Is this implementation present in the GitHub repo of Phoenix? If yes, can you provide me the package name/classes? I haven't got the opportunity to try out Phoenix yet but i would like to have a look at the implementation. Thanks, Anil Gupta On Thu, May 16, 2013 at 4:15 PM,

Re: Problem accessing table.jsp page in HBase

2012-03-16 Thread anil gupta
se-webapps/master/WEB-INF: total 4 -rwxr-xr-x. 1 root root 1444 Oct 13 20:32 web.xml Can you provide the "ls -lRt" output of your "hbase-webapps" folder or point out if any of the files are missing on my cluster. Thanks, Anil Gupta On Wed, Mar 14, 2012 at 11:40 PM, steven zhuang

HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-03-30 Thread anil gupta
orker node completed the task, hence it means that speculative execution is still on. Why the HBase Bulk loader is doing speculative execution when i have set it to false in mapred-site.xml? Please let me know if i am missing something over here. -- Thanks & Regards, Anil Gupta

Re: HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-03-30 Thread anil gupta
Thanks for the quick reply, Jean. Is there any link where i can find the name of all client-side configuration for HBase? ~Anil On Fri, Mar 30, 2012 at 3:01 PM, Jean-Daniel Cryans wrote: > This is a client-side configuration so if your mapred-site.xml is > _not_ on your classpath when you

Re: HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-03-30 Thread anil gupta
since there is no proper nomenclature for client side properties at present. Thanks for your reply. ~Anil On Fri, Mar 30, 2012 at 3:26 PM, Doug Meil wrote: > > Speculative execution is on by default in Hadoop. One of the Performance > recommendations in the Hbase RefGuide is

Bulk loading job failed when one region server went down in the cluster

2012-03-30 Thread anil gupta
ation which can make Bulk Loading fault-tolerant to failure of region-servers? -- Thanks & Regards, Anil Gupta

Re: Bulk loading job failed when one region server went down in the cluster

2012-03-30 Thread anil gupta
manually bringing down a RS while querying a Table and it worked fine and I was expecting the same today(even though the RS went down by itself today) when i was loading the data. But, it didn't work out well. Thanks for your time. Let me know if you need more details. ~Anil On Fri, Mar 30, 2012

Re: HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-04-02 Thread anil gupta
. Thanks, Anil On Fri, Mar 30, 2012 at 9:54 PM, Harsh J wrote: > Anil, > > You can also disable speculative execution on a per-job basis. See > > http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setMapSpeculativeExecution(boolean) > (Which is

Re: HBase bulk loader doing speculative execution when it set to false in mapred-site.xml

2012-04-02 Thread anil gupta
at the JT would prevent speculative > execution no matter what the client says, but I could be mistaken. > > Sandy > > > -Original Message- > > From: anil gupta [mailto:anilgupt...@gmail.com] > > Sent: Friday, March 30, 2012 14:53 > > To: user@hbase.a

Re: Bulk loading job failed when one region server went down in the cluster

2012-04-03 Thread anil gupta
the biggest selling point of Hadoop platform. Let me know your views. Thanks for your time. Thanks, Anil Gupta On Tue, Apr 3, 2012 at 7:34 AM, Kevin O'dell wrote: > Anil, > > I am sorry for the delayed response. Reviewing the logs it appears: > > 2/03/30 15:38:31 INFO

Re: Storing extremely large size file

2012-04-17 Thread anil gupta
ell > >> wrote: > >>> I think this is a popular topic that might deserve a section in The > Book. > >>> > >>> By "this topic" I mean storing big binary chunks. > >>> > >> > >> Get Jack Levin to write it (smile). > >> > >> And make sure the values are compressed that you send over from the > >> client > >> > >> St.Ack > >> > > -- Thanks & Regards, Anil Gupta

Re: Compare range of numbers on column family

2012-04-20 Thread anil gupta
then you will need to use the BinaryComparator. Hope this Helps -Anil On Fri, Apr 20, 2012 at 3:57 AM, Bijieshan wrote: > Akbar, > > I think you need to customize a comparator yourself. You can't get the > results you want by using BinaryComparator. > Hope I get you co

Re: HBase export fails due to "RetriesExhaustedWithDetailsException: Failed 1 action"

2012-05-03 Thread anil gupta
Hi Petri, If you are getting a " Connection reset by peer." message then most probably its a network problem. Please check your network health. Thanks, Anil Gupta On Thu, May 3, 2012 at 12:04 AM, Petri Väänänen < petri.vaana...@valuemotive.com> wrote: > Hi all, > > I h

Re: Unable to run aggregation using AggregationClient in HBase0.92

2012-05-07 Thread anil gupta
code. Does this mean that "hbase.coprocessor.region.classes" is not a client side configuration? I am just curious to know why it was not working when i was setting the conf through code. Thanks, Anil Gupta On Mon, May 7, 2012 at 2:00 PM, Gary Helml

Re: Unable to run aggregation using AggregationClient in HBase0.92

2012-05-07 Thread anil gupta
Thanks a lot, Gary. On Mon, May 7, 2012 at 2:46 PM, Gary Helmling wrote: > Hi Anil, > > > Does this mean that > > "hbase.coprocessor.region.classes" is not a client side configuration? I > am > > just curious to know why it was not working when i

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-14 Thread anil gupta
HI Stack, I'll look into Gary Helming post and try to do profiling of coprocessor and share the results. Thanks, Anil Gupta On Mon, May 14, 2012 at 12:08 PM, Stack wrote: > On Mon, May 14, 2012 at 12:02 PM, anil gupta > wrote: > > I loaded around 70 thousand 1-2KB records in

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-14 Thread anil gupta
coprocessors i added the logic to add the stopRow manually. What is the reason that Scan object in coprocessor always requires stopRow along with startRow?(code #1 works fine even when i dont use stopRow) Can this restriction be relaxed? Thanks, Anil Gupta On Mon, May 14, 2012 at 12:55 PM, Ted Yu

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-14 Thread anil gupta
ttp://grepcode.com/file_/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.0/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java/?v=source Thanks, Anil Gupta On Mon, May 14, 2012 at 1:58 PM, Ted Yu wrote > Anil: > As code #3 shows, having stopRow helps narrow the ra

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-15 Thread anil gupta
Hi Ted, I created the jira:https://issues.apache.org/jira/browse/HBASE-5999 for fixing this. Creating the patch might take me sometime(due to learning curve) as this is the first time i would be creating a patch. Thanks, Anil Gupta On Mon, May 14, 2012 at 4:00 PM, Ted Yu wrote: > I

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-15 Thread anil gupta
That's true. :) On Tue, May 15, 2012 at 10:47 AM, Ted Yu wrote: > Take your time. > Once you complete your first submission, subsequent contributions would be > easier. > > On Tue, May 15, 2012 at 10:34 AM, anil gupta > wrote: > > > Hi Ted, > > > >

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-15 Thread anil gupta
family."); } } Let me know your thoughts. Thanks, Anil On Tue, May 15, 2012 at 11:46 AM, Ted Yu wrote: > Anil: > I am having trouble accessing JIRA. > > Ted Yu and Zhihong Yu are the same person :-) > > I think it would be good to remind user of aggre

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-15 Thread anil gupta
Hi Ted, I decompiled the hbase-0.92.0-cdh4b1.jar using JD-GUI and in validateParameter method i don't find that condition. Thanks, Anil On Tue, May 15, 2012 at 1:37 PM, Ted Yu wrote: > I checked the code in Apache HBase 0.92 and trunk. I see the following line > in validat

Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?

2012-05-15 Thread anil gupta
g/apache/hbase/hbase/0.92.0-cdh4b1-SNAPSHOT/ > > On Tue, May 15, 2012 at 4:58 PM, anil gupta wrote: > > > Hi Ted, > > > > I decompiled the hbase-0.92.0-cdh4b1.jar using JD-GUI and in > > validateParameter method i don't find that condition. > > > > T
