Vaibhav
By caching do you mean storing all the rows in a HashMap in memory, so
that you can access that map repeatedly instead of doing disk I/O?
Thanks
On Wed, Aug 12, 2009 at 4:53 AM, Vaibhav Puranik wrote:
> Amandeep,
>
> We are caching HBase results in memory (in a HashMap).
>
> Regards,
Hi all,
I have seen a lot of discussion about map-side joins in both the Hadoop and
HBase forums. Can anyone explain, or send me a link pointing to, the algorithm
or code so that I can use it in my application if it is efficient (preferably
for HBase table joins)? If I can understand the algorithm for h
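For what it's worth, the map-side join usually discussed on these lists is the broadcast (replicated) join: load the smaller table fully into memory in each map task, then stream the larger table through the mapper and probe the in-memory copy by join key, so no shuffle/reduce is needed. A toy sketch in plain Java (the table contents and join key are made up for illustration; in a real HBase job the small table would be read via a Scanner in the mapper's setup):

```java
import java.util.*;

public class MapSideJoinSketch {
    // Map-side join: the small table lives in a HashMap keyed by the join
    // column; each row of the big table streaming through the mapper probes it.
    static List<String> join(Map<String, String> smallTable,
                             List<String[]> bigTableRows) {
        List<String> out = new ArrayList<>();
        for (String[] row : bigTableRows) {        // row = {joinKey, payload}
            String match = smallTable.get(row[0]); // O(1) probe, no shuffle
            if (match != null)
                out.add(row[0] + "," + row[1] + "," + match);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> small = new HashMap<>();
        small.put("u1", "bharath");
        small.put("u2", "vaibhav");
        List<String[]> big = Arrays.asList(
            new String[]{"u1", "login"},
            new String[]{"u3", "logout"});
        System.out.println(join(small, big)); // [u1,login,bharath]
    }
}
```

This is only efficient when one table fits comfortably in each map task's heap; otherwise a reduce-side join (tag rows by source table, join in the reducer) is the usual fallback.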
Hi all,
I have gone through the IndexedTableAdmin classes in the HBase 0.19.3 API. I
have seen some methods used to create an indexed table (on some column). I
have some doubts regarding the same:
1) Are these somewhat similar to hash indexes (in an RDBMS), where I can easily
look up a column value
> As far as I understand you are talking about the secondary indexes. Yes,
> they can be used to quickly get the rowkey by a value in the indexed column.
>
> --Kirill
>
>
> bharath vissapragada wrote:
>
>> Hi all ,
>>
>> I have gone through the Indexe
y faster than a full table
> scan.
>
> If you need hash-level performance on the index lookup, there are lots of
> solutions outside of HBase that would work... In-memory Java HashMap, Tokyo
> Cabinet on-disk HashMaps, BerkeleyDB, etc... If you need full-text indexing,
> you can
e index hits doesn't really make sense.
>
> Clint? You out there? :)
>
> JG
>
>
> bharath vissapragada wrote:
>
>> I got it ... I think this is definitely useful in my app because I am
>> performing a full table scan every time for selecting the rowkeys based on
te:
> > I'm actually unsure about that. Look at the code or experiment.
> >
> > Seems to me that there would be a uniqueness requirement, otherwise what
> do
> > you expect the behavior to be? A get can only return a single row, so
> > multiple index hits doesn
ndividual columns). I'm using the same approach for one access
> pattern and so far it seems to work very well.
>
> But as far as I know the built in secondary indexing assumes 1
> secondary index table row -> 1 original table row.
>
> Sorry if this got a bit long-w
Hi all,
Can anyone tell me where I can access some docs that give a good
explanation of how the map-reduce scheduler in HBase works,
i.e., how map regions are created (to minimize data flow through the network)
and how the reduce phases are performed, so that we can minimize
the flow of keys and values a
Amandeep, Gray and Purtell, thanks for your replies. I have found them
very useful.
You said to increase the number of reduce tasks. Suppose the number of
reduce tasks is more than the number of distinct map output keys; some of the
reduce processes may be wasted? Is that the case?
Also I have
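On the idle-reducer question above: with Hadoop's default HashPartitioner, a reducer only receives input if some map output key hashes to its partition, so with more reduce tasks than distinct keys some reducers necessarily sit idle (they still start and finish, producing empty output). A toy check in plain Java (the keys are made up):

```java
import java.util.*;

public class PartitionDemo {
    // Mimics Hadoop's default HashPartitioner:
    //   partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
    static Set<Integer> busyReducers(List<String> mapOutputKeys,
                                     int numReduceTasks) {
        Set<Integer> busy = new TreeSet<>();
        for (String k : mapOutputKeys)
            busy.add((k.hashCode() & Integer.MAX_VALUE) % numReduceTasks);
        return busy;
    }

    public static void main(String[] args) {
        // 3 distinct keys, 10 reduce tasks -> at most 3 reducers get any input;
        // the remaining 7 run but receive nothing.
        Set<Integer> busy = busyReducers(Arrays.asList("a", "b", "c", "a"), 10);
        System.out.println(busy.size());
    }
}
```

So extra reducers beyond the number of distinct keys cost task-startup overhead without adding parallelism.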
ge files. Keeping the data transfer local in
> this case results in lower performance.
>
> If you want max local speed, I suggest looking at CUDA.
>
>
> On Thu, Aug 20, 2009 at 9:09 PM, bharath
> vissapragada wrote:
> > Aamandeep , Gray and Purtell thanks for your repli
I recall, and became disk
> i/o bound (as one would expect/hope).
>
> For a majority of use cases, it doesn't matter in a significant way at all.
> But I have seen it make a measurable difference for some.
>
> JG
>
>
> bharath vissapragada wrote:
>
>> Thank
You could potentially be network i/o
> bound.
>
> It should be very easy to test how your own jobs run on your own cluster
> using Ganglia and hadoop/mr logging/output.
>
>
> bharath vissapragada wrote:
>
>> JG
>>
>> Can you please elaborate on the la
Hi,
I saw the tableindexed package here:
http://people.apache.org/~stack/hbase-0.20.0-candidate-1/docs/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html
I have a doubt ...
Suppose I have the following table:
rowkey  col
1       a
2       a
3       b
4
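As an aside, the uniqueness concern raised elsewhere in this thread (a get can only return a single row) can be worked around by having the index map a column value to a *set* of rowkeys; this is conceptually what building the index row key from value + original rowkey achieves. A toy sketch in plain Java, with made-up data matching the table above:

```java
import java.util.*;

public class NonUniqueIndexSketch {
    // Secondary index tolerating duplicate column values:
    //   column value -> sorted set of rowkeys holding that value.
    private final Map<String, SortedSet<String>> index = new HashMap<>();

    void put(String rowKey, String columnValue) {
        index.computeIfAbsent(columnValue, v -> new TreeSet<>()).add(rowKey);
    }

    SortedSet<String> lookup(String columnValue) {
        return index.getOrDefault(columnValue, Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        NonUniqueIndexSketch idx = new NonUniqueIndexSketch();
        idx.put("1", "a");
        idx.put("2", "a");
        idx.put("3", "b");
        System.out.println(idx.lookup("a")); // both rowkeys for value "a"
    }
}
```

Looking up "a" returns both rowkeys 1 and 2, instead of forcing a one-to-one mapping from index row to original row.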
Hi all ,
I am new to this forum and don't know whether this is the right place to post
my doubts.
I have a problem in accessing the svn "
http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/examples/uploaders/";
... it says "couldn't connect to svn.." ... can anyone please tell me if
the svn is
Hi all,
I'm new to the HBase API .. can anyone tell me how to add a "row" to an
existing HBase table?
I saw the BatchUpdate class, which only modifies existing "rows". I also
checked out the HTable and HBaseAdmin classes and I'm clueless.
Kindly bear with my doubt and please reply. It's kind of urgent.
Thanks in advance
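For the record, in the 0.19 API a BatchUpdate against a row key that does not yet exist creates the row; there is no separate "insert" call. A sketch along the lines of the example in the 0.19 overview docs (the table name "users" and column "info:name" are made up, and this needs a running HBase instance to execute):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class AddRowSketch {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable table = new HTable(conf, "users");     // existing table
        // New row key -> the row is created on commit.
        BatchUpdate update = new BatchUpdate("newRowKey");
        update.put("info:name", "bharath".getBytes()); // family:qualifier
        table.commit(update);
    }
}
```

HBaseAdmin is only needed for schema operations (creating/disabling tables), not for adding rows.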
docs for more
> information.
>
> JG
>
>
> bharath vissapragada wrote:
>
>> Hi all,
>>
>> Im new to hbase API .. can anyone tell me how to add a "row" to an
>> existing
>> hbase table .
>> I saw Batchupdate class which only modifies exis
Wed, Jul 8, 2009 at 9:40 AM, stack wrote:
> Yes. See example code here:
>
> http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html#overview_description
> St.Ack
>
> On Tue, Jul 7, 2009 at 9:02 PM, bharath vissapragada <
> bharathvissapragada1...@gmail.com&
Hi all,
I have written an HBase program in Java and when I try to run it on HBase it
gives an error (I've pasted it below). I'm trying to run it locally, i.e., no
other systems are involved; HBase runs only on my machine. I have just set
the "JAVA_HOME" variable in "hbase-env.sh" and then I have iss
Hi all,
I have run a Java program and it created a table named "users". I have
made some modifications to the code and I want to run it again. So I tried
to delete the existing table using the "hbase shell". When I tried to issue the
command "drop 'users'" it said "Disable the table first".. Then
Hi all,
I want to join (similar to a relational database join) two tables in HBase.
Can anyone tell me whether it is already implemented in the source!
Thanks in Advance
, 2009 at 10:30 AM, Ryan Rawson wrote:
> HBase != SQL.
>
> You might want map reduce or cascading.
>
> On Tue, Jul 14, 2009 at 9:56 PM, bharath
> vissapragada wrote:
> > Hi all ,
> >
> > I want to join(similar to relational databases join) two tables in HBase
>
bs. That works best.
> Not tried cascading yet.
>
> -ak
>
> On 7/14/09, bharath vissapragada
> wrote:
> > That's fine .. I know that HBase has a completely different usage compared
> > to SQL .. But for my application there is some kind of dependency involved
> &
ase, figuring out how you want to pull your data out is
> key to how you want to put the data in.
>
> JG
>
>
> bharath vissapragada wrote:
>
>> Amandeep, can you tell me what kinds of joins you have implemented? And
>> which works the best (based on observation)? Can
e join, so I'm not
> sure how helpful it would be. If you give some detail perhaps I can provide
> some pseudo-code for you.
>
> JG
>
>
> bharath vissapragada wrote:
>
>> JG, thanks for your reply,
>>
>> Actually I am trying to implement a realtime join of
t; RDBMS with a smallish table will be quite fast at simple index-based
>>> joins.
>>> I would guess that unloaded, single machine performance of this join
>>> operation would be much faster in an RDBMS.
>>>
>>> But if your table has millions or billi
Hi all,
Sorry if this question has already been discussed.
I want to extract all the column names of a table. Can anyone tell me
the corresponding function?
I have googled it but none of the results gave me the correct answer.
Thanks in advance
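One thing worth noting on this question: in HBase only the column *families* are part of the table schema (in the 0.19 Java API, HTableDescriptor.getFamilies()); the full set of qualifiers is not stored anywhere and can only be discovered by scanning the rows. Conceptually (toy plain-Java model of a table, with made-up data):

```java
import java.util.*;

public class ColumnNamesSketch {
    // Toy model of an HBase table: rowkey -> (column name -> value).
    // Collecting every distinct column name means visiting every row,
    // i.e. a full scan; no schema lists the qualifiers up front.
    static SortedSet<String> columnNames(Map<String, Map<String, String>> table) {
        SortedSet<String> names = new TreeSet<>();
        for (Map<String, String> row : table.values())
            names.addAll(row.keySet());
        return names;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> t = new HashMap<>();
        t.put("r1", Map.of("info:name", "x"));
        t.put("r2", Map.of("info:name", "y", "info:age", "21"));
        System.out.println(columnNames(t)); // [info:age, info:name]
    }
}
```

So "list the column names" is cheap for families but is a full-scan operation for qualifiers.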
t; If you're not working in Java, please let us know how you are
> interacting; REST, terminal etc
>
> Cheers,
> Tim
>
>
> On Sun, Jul 19, 2009 at 11:26 AM, bharath
> vissapragada wrote:
> > Hi all ,
> >
> > Sorry if this question has already been discusse
Hi all,
Generally the TableMap.initJob() method takes a "table name" as input to the
map while using map-reduce in HBase.
Is there a way I can use more than one table, i.e., so that the input to the
map contains more than one table?
Thanks
ray wrote:
> Currently, there is not.
>
> You would have multiple MR jobs, or you would directly use the API in your
> job to pull from multiple tables.
>
> I suppose it would be feasible, but as it is now, you are not told which
> table your Result comes from.
>
>
>
as you'd like mappers, use TextInputFormat and just ignore
> the
> input or trigger what the particular mapper does off passed input.
>
> St.Ack
>
>
> On Tue, Jul 21, 2009 at 8:13 PM, bharath vissapragada <
> bharathvissapragada1...@gmail.com> wrote:
>
> &
Hi all,
I have one simple doubt in HBase.
Suppose I use a scanner to iterate through all the rows in HBase and
process the data in the table corresponding to those rows. Is the processing
of that data done locally on the region server in which that particular
region is located, or is it trans
on the same node. If thats not possible, its done on
> the
> >> same rack.
> >>
> >> -ak
> >>
> >>
> >> On Tue, Jul 21, 2009 at 9:43 PM, bharath vissapragada<
> >> bhara...@students.iiit.ac.in> wrote:
> >>
> >
ed databases), you would have to
> fetch data to where computation has to be performed. The whole MR design
> philosophy is to take the code to the data and execute it as close to where
> the data is stored as possible.
>
>
> On Tue, Jul 21, 2009 at 11:48 PM, bharath vissapragada &
That means we have to stick to the principle of MR whenever we require
efficient data processing ..
but map-reduce cannot offer solutions to general database problems, I guess!
On Wed, Jul 22, 2009 at 12:34 PM, Amandeep Khurana wrote:
> On Wed, Jul 22, 2009 at 12:01 AM, bharath vissaprag
Hi all, when I use the TableMap class I get the following errors:
no interface expected here
public class MR_DS_Scan_Case1 extends TableMap implements Tool {
^
and
cannot find symbol
symbol : method
run(org.apache.hadoop.hbase.HBaseConfiguration,MR_DS_Sca
Erik, thanks for your reply.
Actually I want to compute something using the information from both
tables.
I wanted to implement a simple join of two HBase tables and just check how it
performs.
On Wed, Jul 22, 2009 at 10:04 PM, Erik Holstad wrote:
> Hi Bharath!
> Yeah, that is what Jonathan m
I have one more doubt.
In the example given on the site
http://wiki.apache.org/hadoop/Hbase/MapReduce
some codes are written in such a manner that they have only map classes
and no reduce classes.
What I understood is that a map is generated for every regionserver and it
operates on the data p
Hi all,
I am new to HBase MR programming. Though I have already used MR on Hadoop,
I am facing some difficulties in running my HBase MR job. Details are as
follows.
I have written a file sample.java.
I want to run only the map phase and no reduce phase, and I am using
IdentityTableMap to take the inp
Actually I'm running these MR programs in local standalone mode ... Is this
an issue?
On Wed, Jul 22, 2009 at 11:49 PM, Erik Holstad wrote:
> Hi Bharath!
> Did you have a look at http://"your_machine":50030/jobtracker.jsp
> Should give you the logs, or output that you are looking for.
>
> Regard
:44 AM, stack wrote:
> No. Should work.
> St.Ack
>
> On Wed, Jul 22, 2009 at 8:35 PM, bharath vissapragada <
> bharathvissapragada1...@gmail.com> wrote:
>
> > Actually im running these MR programs in local stand alone mode ... Is
> this
> > any issue?
> &g
Hi all,
I wanted to run HBase in standalone mode to check my HBase MR programs. I
have downloaded a built version of hbase-0.20. and I have hadoop 0.19.3.
"I have set JAVA_HOME in both of them" .. then I started HBase and inserted
some tables using the Java API .. Now I have written some MR programs on HBa
since I haven't started the cluster .. I can even see the details in
"localhost:/jobTracker.jsp" .. I didn't even add anything to
hadoop/conf/hadoop-site.xml
On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
bhara...@students.iiit.ac.in> wrote:
> Hi all ,
>
d.JobClient: Reduce input records=8
On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
bhara...@students.iiit.ac.in> wrote:
> since i haven;t started the cluster .. i can even see the details in
> "localhost:/jobTracker.jsp" .. i didn't even add anything to
Reduce input groups=8
> > 09/07/23 23:25:38 INFO mapred.JobClient: Combine output records=0
> > 09/07/23 23:25:38 INFO mapred.JobClient: Map input records=8
> > 09/07/23 23:25:38 INFO mapred.JobClient: Reduce output records=8
> > 09/07/23 23:25:38 INFO
I suspect there is some problem in my HBase or Hadoop conf .. can you tell me
any link or explanation of MR in HBase in standalone mode ... please, it's
kind of urgent!
On Thu, Jul 23, 2009 at 8:43 PM, bharath vissapragada <
bharathvissapragada1...@gmail.com> wrote:
> Thanks for ur reply J
n your host:50030 web UI. So use
> apache common logging and you should see your output.
>
> J-D
>
> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> vissapragada wrote:
> > Thanks for ur reply J-D ... Im pasting some part of the code ...
> >
> > Im doing it
r web UI which is in no use since you run local jobs directly
> on HBase.
>
> J-D
>
> On Thu, Jul 23, 2009 at 11:41 AM, bharath
> vissapragada wrote:
> > I used stdout for debugging while writing codes in hadoop MR programs and
> it
> > worked fine ...
> > Can y
One simple thing to do is to do your
> debugging with a logger so you are sure to see your output as I
> already proposed. Another simple thing is to get a pseudo-distributed
> setup and run you HBase MR jobs with Hadoop and get your logs like I'm
> sure you did before.
>
> J-
I have set "c.setOutputFormat(NullOutputFormat.class);" otherwise it
shows the error
"Output directory not set in JobConf."
I think this is causing trouble ... any ideas?
On Thu, Jul 23, 2009 at 10:12 PM, bharath vissapragada <
bharathvissapragada1...@gmail.com&
ector output, Reporter reporter) throws IOException {
>
> I should read map not mapp
>
> J-D
>
> On Thu, Jul 23, 2009 at 12:42 PM, bharath
> vissapragada wrote:
> > I have tried apache -commons logging ...
> >
> > instead of printing the row ... i have written log
ge-summary.html
>
> J-D
>
> On Thu, Jul 23, 2009 at 1:04 PM, bharath
> vissapragada wrote:
> > I think this is the problem .. but when i changed it .. it gave me a
> weird
> > error
> >
> > name clash:
> >
> map(org.apache.hadoop.hb
I figured out my error and it is in the OutputCollector ... I really thank
you, J-D, for replying to me constantly .. and I am very sorry for taking up
so much of your time ...
Thanks for your help
On Thu, Jul 23, 2009 at 10:58 PM, bharath vissapragada <
bharathvissapragada1...@gmail.com> wrote:
Hi all,
I wanted to implement the TableMap interface so that the "map" function can
emit pairs .. I wrote the code as follows
---
package org.apache.hadoop.hbase.mapred;
import java.io.IOException;
import org.ap
Thanks, it worked fine .. Do I need to update hbase-x.x.jar? Or is there
some other procedure to use it in my program? ... because when I updated
hbase-x.x.jar it gave me a NoClassDefFoundError while running my program!!
2009/7/24 Doğacan Güney
> n Jul 24, 2009, at 1:00 PM, bhar
;>
> >> J-D
> >>
> >> 2009/7/24 john smith :
> >> > Yes, I too have the same problem .. Can anyone tell me in detail how to
> >> > add new classes to the existing hbase jar, or do we have a different
> >> > method to include our own cl
u should put your jar either in the lib folder or in the HBase
> classpath in conf/hbase-env.sh if you want to keep it somewhere else.
>
> Hope this helps,
>
> J-D
>
> On Fri, Jul 24, 2009 at 9:29 AM, bharath
> vissapragada wrote:
> > Actually you might have seen that I have impl
Hi all,
I have implemented the TableMap interface successfully to emit pairs
.. so now I must implement the TableReduce interface to receive those
pairs correspondingly ... Is the following code correct?
public class MyTableReduce
extends MapReduceBase
implements TableReduce {
public void reduce(Text
Hi all,
Is there a way I can get a "Scanner" over the part of the table on a specific
region server, using its "HOST NAME"?
E.g.: suppose I have a table "A" and one of the region servers has the
hostname "region1".
Can I get a scanner to all those rows of table "A" on "region1"?
Thanks
ooking for the regions of interest, then build scanners on the
> specific ranges of each region.
>
> Good luck!
>
> On Sat, Jul 25, 2009 at 2:42 AM, bharath
> vissapragada wrote:
> > Hi all ,
> >
> > Is there a way , i can get "Scanner" to the part of th
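The approach suggested above (locate the regions of interest, then open a scanner per region key range) works because a region covers a contiguous, half-open rowkey range; the region's [startKey, endKey) plays the role of the scanner's start/stop rows. A toy sketch with a sorted map standing in for the table (hostnames and keys are made up):

```java
import java.util.*;

public class RegionScanSketch {
    // Toy model: a table is a sorted map of rowkey -> value. A region covers
    // a half-open key range [startKey, endKey); scanning "just that region"
    // is the same as a scanner bounded by those keys.
    static SortedMap<String, String> scanRange(SortedMap<String, String> table,
                                               String startKey, String endKey) {
        return table.subMap(startKey, endKey);
    }

    public static void main(String[] args) {
        SortedMap<String, String> table = new TreeMap<>();
        table.put("row1", "a");
        table.put("row5", "b");
        table.put("row9", "c");
        // Say the region hosted on "region1" covers [row1, row7).
        System.out.println(scanRange(table, "row1", "row7").keySet());
    }
}
```

In real code you would read the region boundaries for the table out of .META. (as suggested earlier in this thread), pick the regions whose server matches the hostname, and open one bounded scanner per range.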
ed that the value of
info:regioninfo is not a string, whereas for others it is a string!
Any comments?
On Sat, Jul 25, 2009 at 9:38 PM, stack wrote:
> Just do a new Table(".META.") and scan it as you would any other table.
>
> Whats the error?
>
> St.Ack
>
> On
See the example at the bottom of the page:
http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html
It is very well documented ... and you can easily understand it!
On Fri, Jul 31, 2009 at 2:26 PM, Ninad Raut wrote:
> Use a Scanner instead of a get
>
> 2009/7/31
>
> > As to your
Hi all,
I have the hbase 0.19.3 version ... On what versions of hadoop, apart from
0.19.x, can I run this hbase version?
Is it 0.20.x or 0.18.x?
Thanks in advance
cs
> >
> http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html#overview_description
> >
> > Requirements
> >- Java 1.6.x, preferably from Sun.
> >- Hadoop 0.19.x. This version of HBase will only run on this
> > version of Hadoop.
> > .