Re: HConnection.getTable behavior on non existent table

2014-06-25 Thread Nitin Pawar
According to the HConnection API docs at
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnection.html

getTable is still declared to throw IOException.


On Wed, Jun 25, 2014 at 6:48 PM, Anand Nalya anand.na...@gmail.com wrote:

 Hi,

 With HBase 0.96, the HConnection.getTable method used to throw an exception
 if the table did not exist. Based on this exception, I was creating tables
 in HBase as required.

 With HBase 0.98.3 I'm not getting an exception. Is this the expected behavior,
 or am I missing something?

 Thanks,
 Anand




-- 
Nitin Pawar


Re: Online/Realtime query with filter and join?

2013-11-29 Thread Nitin Pawar
What size of data are you looking at?
100 ms for a join over a substantial amount of data would be tricky.
On 29 Nov 2013 16:03, Ramon Wang ra...@appannie.com wrote:

 The general performance requirement for each query is less than 100 ms,
 that's the average level. Sounds crazy, but yes we need to find a way for
 it.

 Thanks
 Ramon


 On Fri, Nov 29, 2013 at 5:01 PM, yonghu yongyong...@gmail.com wrote:

  The question is what you mean by real-time. What is your performance
  requirement? In my opinion, MapReduce is not suitable for real-time
  data processing.
 
 
  On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu azury...@gmail.com wrote:
 
   You can try Phoenix.
On 2013-11-29 3:44 PM, Ramon Wang ra...@appannie.com wrote:
  
Hi Folks
   
It seems to be impossible, but I still want to check whether there is a way we
can run complex queries on HBase with ORDER BY, JOIN, etc., as we can with a
normal RDBMS. We have been asked to provide such a solution. Any ideas? Thanks
for your help.

BTW, I think Impala from CDH might be the way to go, but I haven't had time to
check it yet.
   
Thanks
Ramon
   
  
 



Re: How to join 2 tables using hadoop?

2013-07-19 Thread Nitin Pawar
Try Hive with the HBase storage handler.
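
To make the lookup described in the question below concrete, here is a minimal plain-Python sketch of the logic only. It does not use Hive or the HBase API: dictionaries stand in for the two tables, and the names `table_a`, `table_b`, `lookup_join` and the JSON field `id` are assumptions made up for illustration.

```python
import json

# Stand-ins for the two HBase tables: row key -> cell value.
table_a = {
    "row1": json.dumps({"id": "u42", "note": "first"}),
    "row2": json.dumps({"id": "u7", "note": "second"}),
    "row3": json.dumps({"id": "missing", "note": "no match"}),
}
table_b = {"u42": "Alice", "u7": "Bob"}


def lookup_join(left, right, id_field="id"):
    """For each left row, parse its JSON value and look the id up in right."""
    joined = {}
    for row_key, raw in left.items():
        ref = json.loads(raw).get(id_field)
        if ref in right:  # inner join: rows without a match are skipped
            joined[row_key] = (ref, right[ref])
    return joined


result = lookup_join(table_a, table_b)
```

With Hive, the same effect would come from a JOIN between two tables mapped onto HBase; in a hand-written MR job, the lookup would typically happen in the mapper or be done as a reduce-side join.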


On Fri, Jul 19, 2013 at 9:54 AM, Pavan Sudheendra pavan0...@gmail.com wrote:

 Hi,

 I know that HBase by default doesn't support table joins like an RDBMS.
 But anyway, I have a table whose value contains a JSON document with a
 particular ID in it.
 This ID references another table where it is a key.

 I want to fetch the ID first from table A, then query the second table and
 get its corresponding value.

 What is the best way of achieving this using the MR framework?
 Apologies, I'm still new to Hadoop and HBase, so please go easy on me.

 Thanks for any help

 --
 Regards-
 Pavan




-- 
Nitin Pawar


Re: MapReduce: Reducers partitions.

2013-04-10 Thread Nitin Pawar
I hope I understood what you are asking. If not, pardon me :)
A few lines from the Hadoop developer handbook:

The *Partitioner* class determines which partition a given (key, value) pair
will go to. The default partitioner computes a hash value for the key and
assigns the partition based on this result. It guarantees that all the
records mapping to one key go to the same reducer.

You can write your own custom partitioner as well.
Here is the link:
http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning
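
As a rough illustration of the hash-based scheme quoted above, here is a small plain-Python sketch. It is not Hadoop code: Hadoop's default HashPartitioner computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, and here `zlib.crc32` is an assumed stand-in for `hashCode()`. The function names are invented for this example.

```python
import zlib


def default_partition(key: bytes, num_reducers: int) -> int:
    """Map a key to a reducer index using hash-modulo partitioning."""
    # crc32 stands in for Java's key.hashCode(); any deterministic hash works.
    return (zlib.crc32(key) & 0x7FFFFFFF) % num_reducers


def shuffle(pairs, num_reducers):
    """Group (key, value) pairs into one bucket per reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[default_partition(key, num_reducers)].append((key, value))
    return buckets


pairs = [(b"apple", 1), (b"pear", 1), (b"apple", 2), (b"fig", 1), (b"pear", 5)]
buckets = shuffle(pairs, num_reducers=3)
```

Because the partition depends only on the key, every record with the same key lands in the same bucket, which is the guarantee described above. A custom partitioner would simply replace `default_partition` with another deterministic function of the key.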




On Wed, Apr 10, 2013 at 6:19 PM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 Hi,

 Quick question. How is the data from the map tasks partitioned for
 the reducers?

 If there is 1 reducer, it's easy, but if there are more, are all the
 same keys guaranteed to end up on the same reducer? Or not necessarily? If
 they are not, how can we provide a partitioning function?

 Thanks,

 JM




-- 
Nitin Pawar


Re: MapReduce: Reducers partitions.

2013-04-10 Thread Nitin Pawar
To add to what Ted said,

the question Jean asked was already discussed here:

https://issues.apache.org/jira/browse/HBASE-1287


On Wed, Apr 10, 2013 at 7:28 PM, Ted Yu yuzhih...@gmail.com wrote:

 Jean-Marc:
 Take a look at HRegionPartitioner which is in both mapred and mapreduce
 packages:

  * This is used to partition the output keys into groups of keys.

  * Keys are grouped according to the regions that currently exist

  * so that each reducer fills a single region so load is distributed.
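
 A minimal plain-Python sketch of the region-based idea in that description, not the actual HRegionPartitioner code: the start keys below are hypothetical, and the sketch ignores how the real class handles a reducer count that differs from the region count.

```python
from bisect import bisect_right


def region_partition(row_key: bytes, region_start_keys: list) -> int:
    """Return the index of the region whose key range contains row_key.

    region_start_keys must be sorted, and the first region starts at b"".
    """
    return bisect_right(region_start_keys, row_key) - 1


# Hypothetical table with three regions: [b"", b"g"), [b"g", b"p"), [b"p", end)
start_keys = [b"", b"g", b"p"]
assignments = {
    key: region_partition(key, start_keys)
    for key in [b"apple", b"g", b"kiwi", b"zebra"]
}
```

 Each reducer then receives only the keys of one region, so its output fills a single region.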

 Cheers

 On Wed, Apr 10, 2013 at 6:54 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org wrote:

  Hi Nitin,
 
  You got my question correctly.
 
  However, I'm wondering how it works when the output goes into HBase. Do
  we have default partitioners, so that we have the same guarantee that
  records mapping to one key go to the same reducer? Or do we have to
  implement this on our own?
 
  JM
 
 




-- 
Nitin Pawar


Re: Hbase as mongodb

2013-01-16 Thread Nitin Pawar
Maybe this will help:
http://sites.ieee.org/scv-cs/files/2011/03/IBM-Jaql-by-Kevin-Beyer.pdf


On Wed, Jan 16, 2013 at 4:03 PM, Panshul Whisper ouchwhis...@gmail.com wrote:

 Hello,

 Is it possible to use HBase to query JSON documents in the same way as we
 can with MongoDB?

 Suggestions please.
 If we can, then a small example of how would help: not the query, but the
 process flow.
 Thank you so much.
 Regards,
 Panshul.




-- 
Nitin Pawar


Re: Hive and Hbase performance

2012-11-19 Thread Nitin Pawar
Dalia,

By adding a node on the Hive side, I assume you mean increasing the Hadoop
cluster size.
In general, performance is directly related to how much data you have versus
how much computational power you have. The more data you have, the more
computational power you want.

I am no HBase guru, but the following link explains HBase performance
benchmarks in detail, with multiple options in view:
http://hbase.apache.org/book/casestudies.perftroub.html


On Mon, Nov 19, 2012 at 2:18 PM, Dalia Sobhy dalia.mohso...@hotmail.com wrote:


 Hi Anil,

 I thought so, but when checking this blog I got confused. Could you check it
 and give me your feedback?


 http://stackoverflow.com/questions/9300840/is-running-scan-on-hbase-faster-if-running-hbase-on-more-than-1-machine?rq=1

 Thanks,

  From: anilgupt...@gmail.com
  Date: Sat, 17 Nov 2012 16:51:31 -0800
  Subject: Re: Hive and Hbase performance
  To: user@hbase.apache.org
 
  Hi Dalia,
 
  The usual and short answer is yes. Both HBase and Hive will provide better
  performance when you add more nodes, since they provide horizontal
  scalability.
 
  HTH,
  Anil
 
 
 
  On Sat, Nov 17, 2012 at 4:02 PM, Dalia Sobhy dalia.mohso...@hotmail.com wrote:
 
 
  I want to ask whether a Hive count or scan query will provide better
  performance when adding more nodes?

  And whether an HBase count or scan query will provide better performance
  when adding more nodes?
 
 
 
 
  --
  Thanks & Regards,
  Anil Gupta






-- 
Nitin Pawar


RE: HBase table - distinct values

2012-10-11 Thread Nitin Pawar
You may try defining a Hive table with the HBase storage handler and then
querying it, though response time will be slow depending on how much data you
have.
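
For what the query has to compute, here is a minimal plain-Python sketch of the logic only, not Hive or HBase client code. A list of dictionaries stands in for a full table scan, `count_distinct` is a name invented for this example, and the column name `cf:deptno` and the values come from the example in the quoted message below.

```python
def count_distinct(rows, column):
    """Scan every row once and count the distinct values of one column."""
    seen = set()
    for row in rows:
        value = row.get(column)
        if value is not None:  # rows lacking the column are skipped
            seen.add(value)
    return len(seen)


# The deptno values from the example in this thread.
emp = [{"cf:deptno": d} for d in (10, 20, 30, 10, 10, 10)]
distinct_depts = count_distinct(emp, "cf:deptno")
```

Whatever tool runs it, the work is a pass over every row, which is why the response time grows with the table size.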
On Oct 11, 2012 4:19 PM, raviprasa...@polarisft.com wrote:

 Hi Anoop,
  Thanks a lot for your reply.
  Actually, our requirement is just to count the distinct deptno values in emp
 (an HBase table). We are running various Pentaho jobs and need to test
 the validity of the results; for that we need the query below.

 We need a query to select the distinct deptno values from the emp HBase table.

 Example :-
   HBase table name: emp, column family: cf
   Let us say deptno is a field in the column family cf.

 emp
 
 deptno
 10
 20
 30
 10
 10
 10

 The result should be:
 count (distinct deptno) = 3

 We just need the query that gives the count of distinct deptno values.

 Thanks
 Regards
 Raviprasad. T
 Mobile :-  91- 9894769541


 -Anoop Sam John anoo...@huawei.com wrote: -
 To: user@hbase.apache.org user@hbase.apache.org
 From: Anoop Sam John anoo...@huawei.com
 Date: 10/11/2012 09:33AM
 Cc: hbase-u...@hadoop.apache.org hbase-u...@hadoop.apache.org
 Subject: RE: HBase table - distinct values

 Hi Ravi
  If dept_no is a CF:qualifier, to know all the dept numbers (distinct or
 not) you need a full table scan. As Doug said, if it is a frequent online
 query, I don't think MR is a good choice. If the data in your emp table is
 huge, a full table scan also won't be that good, I feel.
  Can you think about storing the dept number in another table? If you also
 need a query like select empdetails from emp where dept_no=?, you can think
 about implementing a secondary index on dept_no. You can use the index
 table for that query as well as the first one you asked about. :)

 -Anoop-
 
 From: raviprasa...@polarisft.com [raviprasa...@polarisft.com]
 Sent: Wednesday, October 10, 2012 7:51 PM
 To: user@hbase.apache.org
 Cc: hbase-u...@hadoop.apache.org; user@hbase.apache.org
 Subject: RE: HBase table - distinct values

 Hi,
   Hbase table name  :- emp
   Column family :- cf
 Under the column family cf  we will be having the field name  deptno


 Regards
 Raviprasad. T
 Mobile :-  91- 9894769541


 -Anoop Sam John anoo...@huawei.com wrote: -
 To: user@hbase.apache.org user@hbase.apache.org, 
 hbase-u...@hadoop.apache.org hbase-u...@hadoop.apache.org
 From: Anoop Sam John anoo...@huawei.com
 Date: 10/10/2012 06:18PM
 Subject: RE: HBase table - distinct values

 Hi
 What is your schema? Is 'deptno' a cf:qualifier?

 -Anoop-
 
 From: raviprasa...@polarisft.com [raviprasa...@polarisft.com]
 Sent: Wednesday, October 10, 2012 4:29 PM
 To: user@hbase.apache.org; hbase-u...@hadoop.apache.org
 Subject: HBase table - distinct values

 Hi all,
   Is it possible to select distinct values from an HBase table?

 Example:
 What is the equivalent in HBase of the Oracle query below?

   Select count (distinct deptno) from emp ;

 Regards
 Raviprasad. T


 This e-Mail may contain proprietary and confidential information and is
 sent for the intended recipient(s) only.  If by an addressing or
 transmission error this mail has been misdirected to you, you are requested
 to delete this mail immediately. You are also hereby notified that any use,
 any form of reproduction, dissemination, copying, disclosure, modification,
 distribution and/or publication of this e-mail message, contents or its
 attachment other than by its intended recipient/s is strictly prohibited.

 Visit us at http://www.polarisFT.com





Re: Any group like this for pentaho?

2012-09-11 Thread Nitin Pawar
Not sure about a Pentaho group, but there is an active Pentaho IRC channel
which runs mostly 24x7.

#pentaho is the IRC room.

On Tue, Sep 11, 2012 at 5:15 PM, Ramasubramanian
ramasubramanian.naraya...@gmail.com wrote:
 Hi,

 Like this HBase group, is there a similar group for Pentaho? If so, please
 share the mailing list address.

 Regards,
 Rams



-- 
Nitin Pawar


Re: hbase security

2012-05-15 Thread Nitin Pawar
You can use the Hadoop + Kerberos security feature to get security at the
Hadoop level.

Similarly, you can edit hbase-site.xml to enable Kerberos authentication.

For more, you can refer to:
https://ccp.cloudera.com/display/CDHDOC/HBase+Security+Configuration

On Tue, May 15, 2012 at 8:11 AM, Rita rmorgan...@gmail.com wrote:

 Hello,

 It seems that in my HBase installation anyone can delete my tables. Is there
 a way to prevent this? I would like only the owner of the HMaster to have
 super authority.

 tia

 --
 --- Get your facts first, then you can distort them as you please.--




-- 
Nitin Pawar


Re: Teradata Procedure handling in hadoop

2012-04-30 Thread Nitin Pawar
With the query approach you can use Hive:

1) Define your tables with the HBase storage handler in Hive.
2) Then write a Hive query such as:

insert overwrite table A select blah blah from B b join C c on (b.col =
c.col) join D d on (c.col2 = d.col1) where <conditions as you want>





On Mon, Apr 30, 2012 at 2:32 PM, Subasis chatterji.suvas...@gmail.com wrote:


 I have a Teradata procedure which inserts data into Table A, selecting data
 from Table B using joins with Table C, Table D and other filter conditions
 at the column label of Table C and Table D.

 Now, how can I handle such a scenario in Hadoop, to have Table A as my
 result set output in HDFS?
 I need help.
 --
 View this message in context:
 http://old.nabble.com/Teradata-Procedure-handling-in-hadoop-tp33763290p33763290.html
 Sent from the HBase User mailing list archive at Nabble.com.




-- 
Nitin Pawar


Re: hbase installation

2012-04-25 Thread Nitin Pawar
Any error message?

On Wed, Apr 25, 2012 at 7:02 PM, shehreen shehreen_cute...@hotmail.com wrote:


 Hi

 I am new to HBase and Hadoop. I want to install HBase and write MapReduce
 jobs over data in HBase. I installed HBase. It works well in standalone
 mode, but the master and ZooKeeper don't start properly in
 pseudo-distributed mode.

 Kindly help me resolve this problem.

 Thanks
 --
 View this message in context:
 http://old.nabble.com/hbase-installation-tp33746422p33746422.html
 Sent from the HBase User mailing list archive at Nabble.com.




-- 
Nitin Pawar


Re: How to build hbase with snappy

2012-04-24 Thread Nitin Pawar
Not sure why it's missing from the Maven repository.

Download it from http://code.google.com/p/hadoop-snappy/

That page has the build steps; follow them and then run the HBase build.

On Tue, Apr 24, 2012 at 5:49 PM, shixing paradise...@gmail.com wrote:

 Is there anybody who can tell me?

 My version is hbase-0.92.0

 On Mon, Apr 23, 2012 at 6:33 PM, shixing paradise...@gmail.com wrote:

  I run this command:
  mvn clean -DskipTests install assembly:single -Dsnappy
 
  it shows:
 
  Downloading: http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-snappy/0.0.1-SNAPSHOT/hadoop-snappy-0.0.1-SNAPSHOT.pom
  Downloading: http://repository.apache.org/snapshots/org/apache/hadoop/hadoop-snappy/0.0.1-SNAPSHOT/hadoop-snappy-0.0.1-SNAPSHOT.pom
  [WARNING] The POM for org.apache.hadoop:hadoop-snappy:jar:0.0.1-SNAPSHOT is missing, no dependency information available
  Downloading: https://repository.apache.org/content/repositories/releases/org/apache/hadoop/hadoop-snappy/0.0.1-SNAPSHOT/hadoop-snappy-0.0.1-SNAPSHOT.jar
  Downloading: http://people.apache.org/~garyh/mvn/org/apache/hadoop/hadoop-snappy/0.0.1-SNAPSHOT/hadoop-snappy-0.0.1-SNAPSHOT.jar
  Downloading: http://repository.apache.org/snapshots/org/apache/hadoop/hadoop-snappy/0.0.1-SNAPSHOT/hadoop-snappy-0.0.1-SNAPSHOT.jar
  [INFO]
  [INFO] BUILD FAILURE
  [INFO]
  [INFO] Total time: 16.636s
  [INFO] Finished at: Mon Apr 23 18:24:47 CST 2012
  [INFO] Final Memory: 10M/360M
  [INFO]
  [ERROR] Failed to execute goal on project hbase: Could not resolve dependencies for project org.apache.hbase:hbase:jar:0.92.1: Could not find artifact org.apache.hadoop:hadoop-snappy:jar:0.0.1-SNAPSHOT in apache release (https://repository.apache.org/content/repositories/releases/) -> [Help 1]
  [ERROR]
  [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
  [ERROR] Re-run Maven using the -X switch to enable full debug logging.
  [ERROR]
  [ERROR] For more information about the errors and possible solutions, please read the following articles:
  [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
 
 
  --
  Best wishes!
  My Friend~
 



 --
 Best wishes!
 My Friend~




-- 
Nitin Pawar


Re: Documentation broken

2012-04-13 Thread Nitin Pawar
Thanks for the quick reply, Oliver.

~nitin

On Fri, Apr 13, 2012 at 1:32 PM, Oliver Meyn (GBIF) om...@gbif.org wrote:

 Looks like /book got moved under another /book, so something is definitely
 wrong.  You can try an unstyled version at:

 http://hbase.apache.org/book/book/book.html

 Cheers,
 Oliver

 On 2012-04-13, at 9:59 AM, Nitin Pawar wrote:

  Hello,
 
  Is there any maintenance going on with hbase.apache.org?
 
  All the links (e.g.
  http://hbase.apache.org/book/architecture.html#arch.overview) are returning
  404 NOT FOUND.
 
  Thanks,
  Nitin Pawar





-- 
Nitin Pawar