Re: Using HBase on other file systems

2010-05-09 Thread Amandeep Khurana
a raw filesystem (file:///mount/filesystem) or you'll need to extend the FileSystem class to write a client that Hadoop Core can use. -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sun, May 9, 2010 at 9:44 AM, Andrew Purtell wrote: > O
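
For illustration (not from the thread): a minimal sketch of pointing an HBase instance at a locally mounted filesystem. The hbase.rootdir property is standard, the mount path is hypothetical, and in a real deployment the same setting would normally live in hbase-site.xml; file:// URIs are served by Hadoop's built-in LocalFileSystem, so no custom FileSystem subclass is needed for a plain mounted volume.

    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class LocalFsRootdir {
        public static void main(String[] args) {
            HBaseConfiguration conf = new HBaseConfiguration();
            // Point the HBase root at a mounted filesystem instead of HDFS.
            conf.set("hbase.rootdir", "file:///mount/filesystem/hbase");
            System.out.println(conf.get("hbase.rootdir"));
        }
    }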

Re: How does HBase perform load balancing?

2010-05-08 Thread Amandeep Khurana
The Yahoo! research link is the most recent one afaik... That's the one submitted to SOCC'10 On Sat, May 8, 2010 at 3:36 AM, Kevin Apte wrote: > Are these the good links for the Yahoo Benchmarks? > http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf > http://research.yahoo.com/files/ycsb.pdf > >

Re: Does HBase do in-memory replication of rows?

2010-05-08 Thread Amandeep Khurana
HBase does not do in-memory replication. Your data goes into a region, which has only one instance. Writes go to the write-ahead log first, which is written to disk. However, since HDFS doesn't yet have a fully working flush, there is a chance of losing that chunk of data. The ne

Re: Theoretical question...

2010-04-29 Thread Amandeep Khurana
tasks across all 100 nodes). Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Thu, Apr 29, 2010 at 1:39 PM, Edward Capriolo wrote: > On Thu, Apr 29, 2010 at 4:31 PM, Michael Segel >wrote: > > > > > Imagine you have a cloud of 100

Re: Looking for advise on Hbase setup

2010-04-25 Thread Amandeep Khurana
int. Is there a current project in the > works for a graph db on hbase? I would love to help out. > > Aaron > > > On Sun, Apr 25, 2010 at 3:48 PM, Amandeep Khurana > wrote: > > > If you want to serve some application off hbase, you might be better > > off with

Re: Looking for advise on Hbase setup

2010-04-25 Thread Amandeep Khurana
data from hbase with some batch processing in hive (small >amount of data) >- Build a large graph db on top of hbase (size unknown, billions at >least) >- Probably a lot more things as time goes along > > Thoughts and opinions welcome. Thanks! > > Aaron > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Issue reading consistently from an hbase test client app

2010-04-16 Thread Amandeep Khurana
What version of HBase are you on? Did you see anything out of place in the master or regionserver logs? This shouldn't be happening...! Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Fri, Apr 16, 2010 at 10:27 AM, Charles Glommen wrote: > Fo

Re: Porting SQL DB into HBASE

2010-04-12 Thread Amandeep Khurana
have no SQL available. Make sure you are aware of the trade-offs between HBase and an RDBMS before you decide... Even 100 million rows can be handled by a relational database if it is tuned properly. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Mon, Apr

Re: Porting SQL DB into HBASE

2010-04-12 Thread Amandeep Khurana
Kranthi, Your tables seem to be small. Why do you want to port them to HBase? -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Mon, Apr 12, 2010 at 1:55 AM, kranthi reddy wrote: > HI jonathan, > > Sorry for the late response. Mi

Re: getSplits() in TableInputFormatBase

2010-04-11 Thread Amandeep Khurana
You have 1 region per table and that's why you are getting 1 split when you scan any of those tables... Moreover, the number-of-map-tasks configuration is ignored when you are running in pseudo-distributed mode since the job tracker is local. Amandeep Khurana Computer Science Graduate Student

Re: getSplits() in TableInputFormatBase

2010-04-11 Thread Amandeep Khurana
3 tables? Are you counting root and meta also? Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sun, Apr 11, 2010 at 1:57 AM, john smith wrote: > From the web interface... > > > number of regions =5 > number of tables = 3 > > Tha

Re: getSplits() in TableInputFormatBase

2010-04-11 Thread Amandeep Khurana
How many regions do you have? Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sun, Apr 11, 2010 at 1:39 AM, john smith wrote: > Amandeep , > > Thanks for the explanation . What is the default value to the num of maps ? > Is it not equa

Re: getSplits() in TableInputFormatBase

2010-04-11 Thread Amandeep Khurana
If you set the number of map tasks as a higher number than the number of regions (I generally set it to 10 or something like that), the number of splits = number of regions. If you keep it lower, then it combines regions in a single split. Amandeep Khurana Computer Science Graduate Student

Re: getSplits() in TableInputFormatBase

2010-04-11 Thread Amandeep Khurana
The number of splits is equal to the number of regions... On Sun, Apr 11, 2010 at 12:54 AM, john smith wrote: > Hi , > > In the method "public org.apache.hadoop.mapred.InputSplit[] *getSplits* > (org.apache.hadoop.mapred.JobConf job, > > i

Re: set number of map tasks for HBase MR

2010-04-10 Thread Amandeep Khurana
Ah.. I still use the old api for the job configuration. In the JobConf object, you can call the setNumMapTasks() function. -ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sat, Apr 10, 2010 at 8:17 PM, Andriy Kolyadenko < c
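
A rough sketch with the old org.apache.hadoop.mapred API; note that when TableInputFormat is the input format this value is only a hint, since the split count still follows the number of regions in the table.

    import org.apache.hadoop.mapred.JobConf;

    JobConf job = new JobConf();
    // Requested map task count; with TableInputFormat the actual number of
    // splits is still bounded by the number of regions.
    job.setNumMapTasks(10);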

Re: set number of map tasks for HBase MR

2010-04-10 Thread Amandeep Khurana
You can set the number of map tasks in your job config to a big number (eg: 10), and the library will automatically spawn one map task per region. -ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sat, Apr 10, 2010 at 7:59 PM, Andriy Kolyadenko

Re: Using SPARQL against HBase

2010-04-10 Thread Amandeep Khurana
Create a jira to spec out the design for the RDF layer: https://issues.apache.org/jira/browse/HBASE-2433. I'll post an initial design and some other ideas on it soon. Go ahead and put in whatever you have in mind. -ak Amandeep Khurana Computer Science Graduate Student University of Calif

Re: Using SPARQL against HBase

2010-04-05 Thread Amandeep Khurana
sh the subject value to decide the table it should be placed in. Thoughts? This gives us scale as well as the ability to do fast querying.. Of course, as Andy mentioned, we'll have to find a subset of queries that we will support. [1] http://ceph.newdream.net/papers/weil-crush-sc06.pdf Amand

Re: Using SPARQL against HBase

2010-04-05 Thread Amandeep Khurana
jacent list?" How did you set > > that up? > > > > > > > > -Original Message- > > From: Amandeep Khurana [mailto:ama...@gmail.com] > > Sent: Wednesday, March 31, 2010 5:42 PM > > To: hbase-user@hadoop.apache.org > > Subject: Re: Using SPARQL against HBase >

Re: Using SPARQL against HBase

2010-04-05 Thread Amandeep Khurana
that would be useful. Are you aware of Google's work with linked data and Bigtable? Give us some insights there... -ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Mon, Apr 5, 2010 at 2:51 AM, Edward J. Yoon wrote: > Well, the structure

Re: Using SPARQL against HBase

2010-04-04 Thread Amandeep Khurana
that fast querying is possible. Do you have any idea on how Google stores linked data in bigtable? We can build on it from there. -ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sun, Apr 4, 2010 at 10:50 PM, Edward J. Yoon wrote: > Hi, I'

Re: Using SPARQL against HBase

2010-04-01 Thread Amandeep Khurana
ne on HBase and how HBase can be used as is or remodeled to fit the problem. Depending on what we find out, we'll decide on taking the project further and committing efforts towards it. -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On

Re: Using SPARQL against HBase

2010-03-31 Thread Amandeep Khurana
Raffi, This article might interest you: http://decentralyze.com/2010/03/09/rdf-meets-nosql/ Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Mar 31, 2010 at 8:27 AM, Basmajian, Raffi < rbasmaj...@oppenheimerfunds.com> wrote: &g

Re: Using SPARQL against HBase

2010-03-31 Thread Amandeep Khurana
I didn't do queries over triples. It was essentially a graph stored as an adjacency list, and I used gets and scans for all the work. Andrew, if Trend is interested too, we can make this a serious project. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On
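
The adjacency-list layout isn't spelled out in the thread; as an illustration only, one common scheme (table and family names here are hypothetical) keys each row by vertex id and stores one qualifier per neighbour, so a single Get returns a vertex's full adjacency list:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    HTable graph = new HTable(new HBaseConfiguration(), "graph");
    // Row key = vertex id; each neighbour becomes a qualifier under "edges".
    Put edge = new Put(Bytes.toBytes("vertexA"));
    edge.add(Bytes.toBytes("edges"), Bytes.toBytes("vertexB"), Bytes.toBytes("1"));
    graph.put(edge);
    // One Get fetches the whole adjacency list of vertexA.
    Result neighbours = graph.get(new Get(Bytes.toBytes("vertexA")));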

Re: Using SPARQL against HBase

2010-03-31 Thread Amandeep Khurana
HBase has a simple API: get(), put() and scan() for the most part. There was work done on a query language recently. It's called HBQL. Here's the link: http://www.hbql.com/ -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Mar 31, 20

Re: Using SPARQL against HBase

2010-03-31 Thread Amandeep Khurana
-ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Mar 31, 2010 at 11:56 AM, Andrew Purtell wrote: > Hi Raffi, > > To read up on fundamentals I suggest Google's BigTable paper: > http://labs.google.com/papers/bigtable.html &

Re: Use cases of HBase

2010-03-09 Thread Amandeep Khurana
over HBase (there is still some steam there but not enough to take it up yet). But as Jonathan said, if you are looking at a data set of the order of 10GB, HBase isn't your best bet. -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Ma

Re: DBInputFormat

2010-02-12 Thread Amandeep Khurana
You can find examples on how to use DBInputFormat on the internet. And if you want a sample input format, just read any of the existing ones... Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Fri, Feb 12, 2010 at 10:39 PM, Gaurav Vashishth wrote

Re: DBInputFormat

2010-02-12 Thread Amandeep Khurana
DBInputFormat takes the row count of the RDBMS table (a count() query) and divides it evenly across the mappers. If you want to split using your own scheme, you'll have to write your own input format or tweak the existing one. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz O
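
For reference, a rough sketch of the standard setup with the old mapred API (connection details and the MyRecord DBWritable class are hypothetical); Hadoop runs a count query against the table and splits that row count evenly across the mappers:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;

    JobConf job = new JobConf();
    job.setInputFormat(DBInputFormat.class);
    // Hypothetical JDBC connection details.
    DBConfiguration.configureDB(job, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost/mydb", "user", "password");
    // MyRecord implements DBWritable; splits come from the table's row count.
    DBInputFormat.setInput(job, MyRecord.class, "mytable",
        null /* conditions */, "id" /* orderBy */, "id", "name");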

Re: thinking about HUG9

2010-02-11 Thread Amandeep Khurana
+1 for March 8th evening post 6pm Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Thu, Feb 11, 2010 at 2:33 PM, Andrew Purtell wrote: > March 8 is ok -- afternoon/evening. > > - Andy > > > > > From: Stack > > Can we d

Re: MR on HDFS data inserted via HBase?

2010-01-13 Thread Amandeep Khurana
> - data that is only available in memory of the regionserver > Precisely the reason why I said it's non-trivial

Re: MR on HDFS data inserted via HBase?

2010-01-13 Thread Amandeep Khurana
Yes, by api I mean TableInputFormat and TableOutputFormat. Pig has a connector to HBase. Not sure if Hive has one yet. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Jan 13, 2010 at 8:28 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com>
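
A rough sketch of wiring those up with the 0.20 org.apache.hadoop.hbase.mapreduce helpers (table names and the MyMapper/MyReducer classes are hypothetical); the input format creates one split per region of the source table, and the reducer writes back to HBase through TableOutputFormat:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    Job job = new Job(new HBaseConfiguration(), "scan-mytable");
    TableMapReduceUtil.initTableMapperJob("mytable", new Scan(),
        MyMapper.class, Text.class, IntWritable.class, job);
    TableMapReduceUtil.initTableReducerJob("outtable", MyReducer.class, job);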

Re: MR on HDFS data inserted via HBase?

2010-01-13 Thread Amandeep Khurana
HBase has its own file format. Writing your own job to read that data directly would not be trivial, but not impossible. Why would you want to use the underlying data files in the MR jobs? Any limitation in using the HBase API? On Wed, Jan 13, 2010 at 8:06 PM, Otis Gospodnetic < otis_gospodne...@yah

Re: unable to write in hbase using mapreduce hadoop 0.20 and hbase 0.20

2009-12-04 Thread Amandeep Khurana
Try and output the data you are parsing from the XML to stdout. Maybe it's not getting any data at all? One more thing you can try is to not use vectors and see if the individual Puts are getting committed or not. Use sysouts to see what's happening in the program. The code seems correct. Amandeep

Re: HBase Index: indexed table or lucene index

2009-11-22 Thread Amandeep Khurana
e these indice in > hbase, how I import them, still column name: value? Seems like the data form > in original htable. Otherwise, if i store them in HDFS, how I use the index > to improve the search. Till now, I am not clear this mechanism can help, so > what do you think of it? > >

Re: HBase Index: indexed table or lucene index

2009-11-22 Thread Amandeep Khurana
am thinking > build some index or use other mechanism like cache to improve the query > performance. Any suggestions? > > Thanks. > > ------ > From: "Amandeep Khurana" > Sent: Monday, November 23, 2009 2:18 PM > To: &g

Re: HBase Index: indexed table or lucene index

2009-11-22 Thread Amandeep Khurana
What kind of querying do you want to do? What do you mean by query performance? HBase has secondary indexes (IndexedTable). However, it's recommended that you build your own secondary index instead of using the one provided by HBase. Lucene is a different framework altogether. Lucene indexes are f

Re: Java client connection to HBase 0.20.1 problem

2009-11-02 Thread Amandeep Khurana
Restart the network interfaces after you edit the hosts file: sudo /etc/init.d/networking restart On Mon, Nov 2, 2009 at 3:44 AM, Amandeep Khurana wrote: > > Add the following in your /etc/hosts > > > > That might be the problem > > On Mon, Nov 2, 2009 at 3:38

Re: Java client connection to HBase 0.20.1 problem

2009-11-02 Thread Amandeep Khurana
Add the following in your /etc/hosts That might be the problem On Mon, Nov 2, 2009 at 3:38 AM, Sławek Trybus wrote: > Hi to everyone in the forum! > > I'm getting started with HBase 0.20.1 and I'm trying to prepare dev > environment for web application development based on HBase as storage. >

Re: HBase -- unique constraints

2009-10-29 Thread Amandeep Khurana
On Thu, Oct 29, 2009 at 6:33 PM, satb wrote: > > > > >> 2. How should we place constraints? For example, if I have a Users > table, > >> and one column is "login_name". How can I ensure that no two people > >> create > >> the same login name? Is there a constraint I can place or something > >> si

Re: HBase -- unique constraints

2009-10-29 Thread Amandeep Khurana
Answers inline On Thu, Oct 29, 2009 at 5:24 PM, learningtapestry wrote: > > I am evaluating if HBase is right for a small prototype I am developing. > > Please help me understand a few things: > > 1. Can I do something like this in RDBMS -- "select * from table WHERE > column LIKE '%xyz%'" > You

Re: How to run java program

2009-10-26 Thread Amandeep Khurana
Comment Inline On Mon, Oct 26, 2009 at 8:00 PM, Liu Xianglong wrote: > Thanks to 梁景明 and Tatsuya Kawano. I use HBase 0.20.0, I did follow what 梁景明 > said, write my code as > HBaseConfiguration config = new HBaseConfiguration(); > config.addResource("./conf/hbase-site.xml"); > Then run it on m

Re: Hbase can we insert such (inside) data faster?

2009-10-26 Thread Amandeep Khurana
keeper is managed by hbase on > both nodes and > quorum consists of two zookeepers per node. > Could you tell me how much Zookeepers should I have per this configuration > and how it usually should be? > BTW, which hards disks did you use? > > 2009/10/26 Amandeep Khurana > &

Re: Hbase can we insert such (inside) data faster?

2009-10-26 Thread Amandeep Khurana
This is slow.. We get about 4k inserts per second per region server with row size being about 30kB. Using VMware could be causing the slowdown. Amandeep On Mon, Oct 26, 2009 at 2:04 AM, Dmitriy Lyfar wrote: > Hello, > > We are using hadoop + hbase (0.20.1) for tests now. Machines we are testin
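
The exact tuning isn't in the snippet; as an illustration, the usual first step on the client side is to batch puts instead of sending one RPC per row (table and family names below are hypothetical):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    HTable table = new HTable(new HBaseConfiguration(), "webpages");
    table.setAutoFlush(false);                    // buffer puts client-side
    table.setWriteBufferSize(12 * 1024 * 1024);   // flush roughly every 12 MB
    for (int i = 0; i < 100000; i++) {
        Put p = new Put(Bytes.toBytes("row-" + i));
        p.add(Bytes.toBytes("content"), Bytes.toBytes("raw"), Bytes.toBytes("payload"));
        table.put(p);
    }
    table.flushCommits();                         // push whatever is still buffered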

Re: Store Large files/images HBase

2009-10-19 Thread Amandeep Khurana
comments inline Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Mon, Oct 19, 2009 at 6:58 AM, Fred Zappert wrote: > Does anyone want to pick up on this? > > -- Forwarded message -- > From: Luis Carlos Junges > Da

Re: Capacity planning

2009-10-16 Thread Amandeep Khurana
16, 2009 at 11:16 AM, Amandeep Khurana > wrote: > > > We've got similar insert speed too.. We have bigger rows - about 30k > > each. And get around 3-4k inserts/second using MR jobs with some > > tuning... > > Quad core/8GB x 9 nodes... > > > > On 10/1

Re: Hbase error

2009-10-16 Thread Amandeep Khurana
Is your zk up? What are you getting in the logs there? Also, post DEBUG logs from your master while you are doing this... Amandeep PS: Use www.pastebin.com to post logs on the mailing list. It's much cleaner and easier. On Wed, Oct 14, 2009 at 9:07 AM, Ananth T. Sarathy < ananth.t.sara...@gmail.c

Re: Capacity planning

2009-10-16 Thread Amandeep Khurana
ks, >> >> Fred. >> > > > > -- > http://www.drawntoscaleconsulting.com - Scalability, Hadoop, HBase, > and Distributed Lucene Consulting > > http://www.roadtofailure.com -- The Fringes of Scalability, Social > Media, and Computer Science > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Question about MapReduce

2009-10-15 Thread Amandeep Khurana
On Thu, Oct 15, 2009 at 5:31 PM, Something Something wrote: > Kevin - Interesting way to solve the problem, but I don't think this > solution is bullet-proof. While the MapReduce is running, someone may > modify the "flag" and that will completely change the outcome - unless of > course there's

Re: Question about MapReduce

2009-10-15 Thread Amandeep Khurana
Comments inline On Thu, Oct 15, 2009 at 2:20 PM, Something Something wrote: > I have 3 HTables Table1, Table2 & Table3. > I have 3 different flat files. One contains keys for Table1, 2nd contains > keys for Table2 & 3rd contains keys for Table3. > > Use case: For every combination of thes

Re: Hbase Installation

2009-10-13 Thread Amandeep Khurana
Comments inline On Tue, Oct 13, 2009 at 1:04 AM, kranthi reddy wrote: > Hi all, > > I am new to Hbase and am having some troubles in understanding how it > works. > > I am trying to setup Hbase in distributed mode with 2 machines in my > cluster. I have hadoop up and running on these 2 machines.

Re: SessionExpiredException: KeeperErrorCode = Session expired for /hbase

2009-10-11 Thread Amandeep Khurana
This shouldn't be causing your jobs to fail... These are just session expiry warnings. New nodes will get created in zk and the jobs shouldn't be interrupted. On Sun, Oct 11, 2009 at 10:50 PM, Something Something < luckyguy2...@yahoo.com> wrote: > Occasionally, I see the following exception in the

Re: NotServingRegionException on .META. table?!

2009-10-08 Thread Amandeep Khurana
Can you change the logging level to DEBUG and post the logs again.. That'll give a better idea of what's happening. You don't need to attach the logs. Use pastebin or pastie.. Those are convenient. -ak Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz
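
For illustration, the level can be raised in conf/log4j.properties (log4j.logger.org.apache.hadoop.hbase=DEBUG) or, as a rough sketch, programmatically with the log4j 1.2 API:

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;

    // Turn on DEBUG for all HBase classes in this JVM.
    Logger.getLogger("org.apache.hadoop.hbase").setLevel(Level.DEBUG);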

Re: how to handle large volume reduce input value in mapreduce program?

2009-10-08 Thread Amandeep Khurana
with it in the reducer. Also.. increase the number of reducers. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Sep 29, 2009 at 1:45 AM, wrote: > Hi, all > > > > I am a newbie to hadoop and just begin to play it recent days. I

Re: can TableInputFormat take array of tables as fields

2009-10-06 Thread Amandeep Khurana
guess that's not what you are looking for in this case. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Oct 6, 2009 at 11:33 AM, Huang Qian wrote: > Thank you for the help. > > The data is distribute on different htables, all of which

Re: can TableInputFormat take array of tables as fields

2009-10-06 Thread Amandeep Khurana
data I give to them. > how can I make it without create a new inputformat implement > tableinputformat? > > 2009/10/5 Amandeep Khurana > > > Afaik, you can scan across only a single table. However, you can read > > other tables using the api within the map or reduce ta

Re: can TableInputFormat take array of tables as fields

2009-10-05 Thread Amandeep Khurana
les as inputs? > > -- > Huang Qian(黄骞) > Institute of Remote Sensing and GIS,Peking University > Phone: (86-10) 5276-3109 > Mobile: (86) 1590-126-8883 > Address:Rm.554,Building 1,ChangChunXinYuan,Peking > Univ.,Beijing(100871),CHINA > -- Amandeep Khurana Computer Scien

Re: table design suggestions...

2009-10-01 Thread Amandeep Khurana
On Tue, Sep 29, 2009 at 11:28 PM, Sujee Maniyam wrote: > > You can either create 2 tables. One can have the user as the key and the > > other can have the country as the key.. > > > > Or.. you can create a single table with user+country as the key. > > > > Third way is to have only one table with
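
As an illustration of the single-table option with user+country in the key (field names and the delimiter are hypothetical), the composite key is simply built client-side before the Put:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    String user = "alice";
    String country = "US";
    // User first, so all rows for one user sort and scan together.
    byte[] rowKey = Bytes.toBytes(user + "|" + country);
    Put p = new Put(rowKey);
    p.add(Bytes.toBytes("access"), Bytes.toBytes("hits"), Bytes.toBytes(1L));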

Re: table design suggestions...

2009-09-29 Thread Amandeep Khurana
comments inline Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Sep 29, 2009 at 3:10 PM, Sujee Maniyam wrote: > HI all, > I am in the process of migrating a relational table to Hbase. > > Current table: records user access logs &

Re: Are you using the Region Historian? Read this

2009-09-18 Thread Amandeep Khurana
+1.. I typically end up looking into the logs as well. It's a good feature to have though. On Fri, Sep 18, 2009 at 1:51 PM, Jean-Daniel Cryans wrote: > I opened HBASE-1854 about removing the historian and it will be > applied to 0.20.1 and 0.21.0 then I'll open another one for the > migration code

Re: Mapreduce dameons?

2009-09-08 Thread Amandeep Khurana
Where do you store your HBase data? Aren't you using Hadoop and HDFS? If you want to run MR jobs over data stored in HBase, you would need a Hadoop instance... Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Sep 8, 2009 at 12:52 PM, Keith Thomas

Re: skipping the map input value and key?!!

2009-09-04 Thread Amandeep Khurana
Comments inline Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Fri, Sep 4, 2009 at 5:46 AM, Xine Jar wrote: > Hallo, > I have a mapreduce application reading from an existing hbase table. The > map > function searches for some values

Re: Cassandra vs HBase

2009-09-01 Thread Amandeep Khurana
They do target the same problem but afaik, the design is different. Cassandra is a P2P based system whereas HBase has the concept of a Master node. HBase is on the same lines as BigTable. You can read the paper to get a better idea about the design philosophy... On Tue, Sep 1, 2009 at 11:45 AM, ch

Re: Save data to HBase

2009-09-01 Thread Amandeep Khurana
What do you mean by quick way? You can use the api to put data into it through a standalone java program or you can use a MR job to do it.. On Tue, Sep 1, 2009 at 2:48 AM, wrote: > Hi there, > > Any quick way that I can save my data(12G) to HBase? > Thanks > > Fleming > > -

Re: Turn of Zookeeper

2009-08-25 Thread Amandeep Khurana
ZK is essential to get HBase going.. Don't think there's a way to avoid that. However, writing unit tests for your code might be a good idea.. See the existing test cases to get an idea of how that can be done. Amandeep Khurana Computer Science Graduate Student University of California,

Re: Location of HBase's database (database' s files) on the hard disk

2009-08-24 Thread Amandeep Khurana
Data in HBase tables resides in memory till it gets flushed to persistent storage. Anything that was in memory would probably be lost unless you are using the write-ahead log (which you would be using by default in 0.20 if you don't disable it in the puts). Amandeep Khurana Computer
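
For illustration, this is the per-Put switch in the 0.20 client API as I recall it (column names are hypothetical); the WAL is on by default, and turning it off trades durability for write speed:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    Put p = new Put(Bytes.toBytes("row1"));
    p.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
    // Skip the write-ahead log for this put: faster, but the edit is lost
    // if the region server dies before the MemStore is flushed.
    p.setWriteToWAL(false);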

Re: Location of HBase's database (database' s files) on the hard disk

2009-08-21 Thread Amandeep Khurana
see what's happening. Post it here if you can't figure it out. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Fri, Aug 21, 2009 at 1:30 AM, Nguyen Thi Ngoc Huong wrote: > >You dont need to format the namenode everytime.. Just bin/start-all.sh >

Re: HBase-0.20.0 multi read

2009-08-21 Thread Amandeep Khurana
ith HBase? In any case, you would need about 8-9 nodes to have a stable setup. > > Fleming > > > > > > Amandeep Khurana > hbase-user@hadoop.apache.org > >cc: (bcc: Y_82391

Re: Location of HBase's database (database' s files) on the hard disk

2009-08-21 Thread Amandeep Khurana
er, when I restart my computer, I must restart hadoop (by command > ./bin/hadoop format namenode and ./bin/start all) , restart hbase, and my > database is lost. What can I do to save my database? > You don't need to format the namenode every time.. Just bin/start-all.sh > > 2009/8/21

Re: HBase-0.20.0 multi read

2009-08-21 Thread Amandeep Khurana
current connections (at the socket level) that a >single client, identified by IP address, may make to a single member of >the ZooKeeper ensemble. Set high to avoid zk connection issues running >standalone and pseudo-distributed. > > > > > > > > &g

Re: Location of HBase's database (database' s files) on the hard disk

2009-08-21 Thread Amandeep Khurana
ster.java:210) >at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:156) >at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:96) >at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:78) >at org.apache.hadoop.hbase.master.HM

Re: Doubt in HBase

2009-08-20 Thread Amandeep Khurana
h >> > spans across 5 region servers . I am using TableInputFormat to read the >> > data >> > from the tables in the map . When i run the program , by default how >> > many >> > map regions are created ? Is it one per region server or more ? >> > >> > Also after the map task is over.. reduce task is taking a bit more time >> > . >> > Is >> > it due to moving the map output across the regionservers? i.e, moving >> > the >> > values of same key to a particular reduce phase to start the reducer? Is >> > there any way i can optimize the code (e.g. by storing data of same >> reducer >> > nearby ) >> > >> > Thanks :) >> > >> > >> > >> > >> > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: HBase-0.20.0 multi read

2009-08-20 Thread Amandeep Khurana
(and any attachments) is proprietary information > for the sole use of its > intended recipient. Any unauthorized review, use or distribution by anyone > other than the intended > recipient is strictly prohibited. If you are not the intended recipient, > please notify the send

Re: HBase-0.20.0 multi read

2009-08-20 Thread Amandeep Khurana
use or distribution by anyone > other than the intended > recipient is strictly prohibited. If you are not the intended recipient, > please notify the sender by > replying to this email, and then delete this email and any copies of it > immediately. Thank you. > --- > > > > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Location of HBase's database (database' s files) on the hard disk

2009-08-20 Thread Amandeep Khurana
t; is lost. Is it always true? Or did I configure wrong? How can i configure > Hbase to save database after restart computer? > > -- > Nguyễn Thị Ngọc Hương > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Doubt in HBase

2009-08-20 Thread Amandeep Khurana
On Thu, Aug 20, 2009 at 9:42 AM, john smith wrote: > Hi all , > > I have one small doubt . Kindly answer it even if it sounds silly. > No questions are silly.. Don't worry > > Iam using Map Reduce in HBase in distributed mode . I have a table which > spans across 5 region servers . I am using

Re: data loss with hbase 0.19.3

2009-08-15 Thread Amandeep Khurana
> simply cannot sync its logs to it. But, we did some work to make >> > >>> that >> > >>> story better. The latest revision in the 0.19 branch and 0.20 RC1 >> both >> > >>> solve much of the data loss problem but it won't be near perfect >> until >> > >>> we have appends (supposed to be available in 0.21). >> > >>> >> > >>> J-D >> > >>> >> > >>> On Thu, Aug 6, 2009 at 12:45 AM, Chen Xinli >> > wrote: >> > >>> > Hi, >> > >>> > >> > >>> > I'm using hbase 0.19.3 on a cluster with 30 machines to store web >> > data. >> > >>> > We got a poweroff days before and I found much web data lost. I >> have >> > >>> > searched google, and find it's a meta flush problem. >> > >>> > >> > >>> > I know there is much performance improvement in 0.20.0; Is the >> > >>> > data >> > lost >> > >>> > problem handled in the new version? >> > >>> > >> > >>> > -- >> > >>> > Best Regards, >> > >>> > Chen Xinli >> > >>> > >> > >>> >> > >> >> > >> >> > >> >> > >> -- >> > >> Best Regards, >> > >> Chen Xinli >> > >> >> > > >> > > >> > > >> > > -- >> > > Best Regards, >> > > Chen Xinli >> > > >> > >> >> >> >> -- >> Best Regards, >> Chen Xinli >> > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: anyone with experience in hbase 0.20 realtime application?

2009-08-12 Thread Amandeep Khurana
t and that makes reads much faster. -amandeep On 8/11/09, Vaibhav Puranik wrote: > Amandeep, > > We are caching Hbase results in memory (in a HashMap). > > Regards, > Vaibhav > > On Tue, Aug 11, 2009 at 12:56 PM, Amandeep Khurana wrote: > >> Vaibhav, >> >

Re: anyone with experience in hbase 0.20 realtime application?

2009-08-11 Thread Amandeep Khurana
Vaibhav, What kind of caching are you doing over hbase and how? -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, Aug 11, 2009 at 10:48 AM, Vaibhav Puranik wrote: > We are using HBase 0.20 (Trunk version at 23rd July evening)

Re: anyone with experience in hbase 0.20 realtime application?

2009-08-11 Thread Amandeep Khurana
I'm working on an app that will be feeding off hbase in real time. However, it's not in production yet. The front end/service layer is yet to be completed and tested. The initial testing of a prototype did look promising though. On Tue, Aug 11, 2009 at 10:29 AM, Fabio Kaminski wrote: > is there a

Re: Region servers down when inserting with hbase0.20.0 rc

2009-08-06 Thread Amandeep Khurana
op.hbase.client.HTable.flushCommits(HTable.java:584) > at org.apache.hadoop.hbase.client.HTable.put(HTable.java:450) > at hbasetest.HBaseWebpage.insert(HBaseWebpage.java:82) > at hbasetest.InsertThread.run(InsertThread.java:26) > . > . > . > . > . > . > . > > > > Any suggestion? > Thanks a lot, > LvZheng > > 2009/8/5 Zheng Lv > >> Hi Stack, >> Thank you very much for your explaination. >> We just adjusted the value of the property "zookeeper.session.timeout" >> to 12, and we are observing the system now. >> "Are nodes running on same nodes as hbase? " --Do you mean we should >> have several servers running exclusively for zk cluster? But I'm afraid >> that >> we can not have that many servers. Any suggestion? >> We don't config the zk in zoo.cfg, but in hbase-site.xml. Following is >> the content in hbase-site.xml about zk. >> >> hbase.zookeeper.property.clientPort >> >> >> >> >> hbase.zookeeper.quorum >> ubuntu2,ubuntu3,ubuntu7,ubuntu9,ubuntu6 >> >> >> >> zookeeper.session.timeout >> 12 >> >> >> Thanks a lot, >> LvZheng >> >> > -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Some confusions about What HBase is and When to use it?

2009-08-04 Thread Amandeep Khurana
bles etcetera. You'd want to not run any map reduce over >> it at the same time. >> >> > Or >> > Its main goal is to analyse whole data and calculations for the internal >> > use? Not for serving them to users in realtime like RDBMS? >> >> HBase 0.20 can handle real time now. See >> http://devblog.streamy.com/2009/07/24/streamy-hadoop-summit-hbase-goes-realtime/ >> >> >> > >> > Thank you so much. >> >> >> hope that helps. >> ~Tim -- Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: Problem with TableInputFormat - HBase 0.20

2009-08-03 Thread Amandeep Khurana
The implementation in the new package is different from the old one. So, if you want to use it in the same way as you used to use the old one, you'll have to stick to the mapred package till the time you upgrade the code according to the new implementation. On Mon, Aug 3, 2009 at 3:45 PM, Lucas N
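
For reference, the two variants live in different packages and are configured differently (the note on the Scan-based setup reflects the 0.20 code as I recall it):

    // Old API: used with JobConf-based jobs and a plain column list.
    import org.apache.hadoop.hbase.mapred.TableInputFormat;
    // New API in 0.20: used with org.apache.hadoop.mapreduce.Job and
    // configured through a serialized Scan, so the job setup code differs.
    // import org.apache.hadoop.hbase.mapreduce.TableInputFormat;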

Re: Build 0.2 from svn checkout (NoClassDefFoundError: javax/servlet/jsp/JspFactory)

2009-07-25 Thread Amandeep Khurana
Trunk has a lot of new stuff after alpha and is much more stable. I just downloaded the trunk and built it without trouble. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sat, Jul 25, 2009 at 10:25 AM, Saptarshi Guha wrote: > Hello, > I

Re: Data Processing in hbase

2009-07-22 Thread Amandeep Khurana
> I'd recommend you read up the papers on MR, BigTable, and some of the latest stuff on HadoopDB etc. That'll give you clarity. > > On Wed, Jul 22, 2009 at 12:34 PM, Amandeep Khurana > wrote: > > > On Wed, Jul 22, 2009 at 12:01 AM, bharath vissapragada

Re: Data Processing in hbase

2009-07-22 Thread Amandeep Khurana
other machines ..? Of course. Network and I/O overheads definitely plague processing large datasets. > > > On Wed, Jul 22, 2009 at 12:24 PM, Amandeep Khurana > wrote: > > > HBase is meant to store large tables. The intention is to store data in a > > way thats more scalable a

Re: Data Processing in hbase

2009-07-21 Thread Amandeep Khurana
> On Wed, Jul 22, 2009 at 12:12 PM, Amandeep Khurana > wrote: > > > Yes.. Only if you use MR. If you are writing your own code, it'll pull > the > > records to the place where you run the code. > > > > On Tue, Jul 21, 2009 at 11:39 PM, Fernando Padilla >

Re: Data Processing in hbase

2009-07-21 Thread Amandeep Khurana
(like java)? > > > > On 7/21/09 9:49 PM, Amandeep Khurana wrote: > >> Bharath, >> >> The processing is done as local to the RS as possible. The first attempt >> is >> at doing it local on the same node. If thats not possible, its done on the >> same rac

Re: Data Processing in hbase

2009-07-21 Thread Amandeep Khurana
Bharath, The processing is done as local to the RS as possible. The first attempt is at doing it local on the same node. If that's not possible, it's done on the same rack. -ak On Tue, Jul 21, 2009 at 9:43 PM, bharath vissapragada < bhara...@students.iiit.ac.in> wrote: > Hi all, > > I have one s

Re: Join in HBase

2009-07-14 Thread Amandeep Khurana
ng. >> >> On Tue, Jul 14, 2009 at 9:56 PM, bharath >> vissapragada wrote: >> > Hi all , >> > >> > I want to join(similar to relational databases join) two tables in HBase >> . >> > Can anyone tell me whether it is already implemented in

Re: region offline

2009-07-08 Thread Amandeep Khurana
he intended > recipient is strictly prohibited. If you are not the intended recipient, > please notify the sender by > replying to this email, and then delete this email and any copies of it > immediately. Thank you. > ------

Accessing a 0.20 cluster from outside a firewall

2009-06-22 Thread Amandeep Khurana
firewall. Shouldn't hbase/zk (not sure where this trouble is) be giving back the DNS name rather than the IP address? Any pointers on this? Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

HBase v0.19.3 with Hadoop v0.19.1?

2009-06-04 Thread Amandeep Khurana
I have a couple of questions: 1. Is the HBase 0.19.3 release stable for a production cluster? 2. Can it be deployed over Hadoop v0.19.1? ..amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz

Re: HBase Data Model

2009-05-13 Thread Amandeep Khurana
hbase might be optimal for (debatable - I don't have exact numbers). Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, May 13, 2009 at 1:43 PM, Amandeep Khurana wrote: > I store an entire graph in a single hbase table. This has essentially >

Re: HBase Data Model

2009-05-13 Thread Amandeep Khurana
it like a giant sparse matrix of some sort (not exactly though). Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, May 13, 2009 at 12:24 PM, llpind wrote: > > Thanks. Thats cool, I'm interested in indexes. Here is a classic > student/c

Re: Hadoop Summit 2009 - Open for registration

2009-05-12 Thread Amandeep Khurana
It shows sold out on the website. Any chances of more seats opening up? Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Tue, May 5, 2009 at 2:10 PM, Ajay Anand wrote: > This year's Hadoop Summit > (http://developer.yahoo

Re: Region Servers going down frequently

2009-04-09 Thread Amandeep Khurana
When you say column foo: it basically picks up all the columns under the family foo:.. You don't have to give individual column names. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Thu, Apr 9, 2009 at 12:25 AM, Rakhi Khatwani wrote: > Tha
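
Shown with the newer Scan-based client API for illustration (the thread itself predates it; family and qualifier names are hypothetical):

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("foo"));   // every column under the foo: family
    // versus a single column:
    // scan.addColumn(Bytes.toBytes("foo"), Bytes.toBytes("bar"));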

Re: Region Servers going down frequently

2009-04-09 Thread Amandeep Khurana
You can use this... https://issues.apache.org/jira/browse/HBASE-897 Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Apr 8, 2009 at 11:59 PM, Rakhi Khatwani wrote: > Hi Andy, > > I want to back up my HBase and move to a more powerful m

Re: Region Servers going down frequently

2009-04-08 Thread Amandeep Khurana
When is 0.20 release expected? Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Apr 8, 2009 at 12:37 AM, Ryan Rawson wrote: > Just FYI, 0.20 handles small cell values substantially better than 0.19.1. > > -ryan > > On Wed, Apr 8

Re: Region Servers going down frequently

2009-04-08 Thread Amandeep Khurana
reeze... Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Apr 8, 2009 at 12:29 AM, Rakhi Khatwani wrote: > Thanks, Amandeep > > One more question, i have mailed it earlier and i have attached the > snapshot > along with that email. >
