Hadoop BootCamp in Berlin Aug 27, 28th (reminder)

2009-08-03 Thread Chris K Wensel


Hi all,

A quick reminder that Scale Unlimited will run a two-day Hadoop BootCamp
in Berlin on August 27th and 28th.


This two-day course is for managers and developers who want to quickly
become experienced with Hadoop and related technologies.


The BootCamp provides training in MapReduce theory, Hadoop
architecture, configuration, and APIs through our hands-on labs.


All our courses are taught by practitioners with years of experience
using Hadoop and related technologies in large data architectures.


** Professional independent consultants may take this course for free;
please email i...@scaleunlimited.com to inquire.

http://www.scaleunlimited.com/courses/programs

Detailed course and registration information is at:

  http://www.scaleunlimited.com/courses/berlin08 (German) or
  http://www.scaleunlimited.com/courses/hadoop-boot-camp-berlin-en (English)


cheers,
chris

P.S. Apologies for the cross-posting.
P.P.S. Please spread the word!

~~~
Hadoop training and consulting
http://www.scaleunlimited.com


Re: Problem with TableInputFormat - HBase 0.20

2009-08-03 Thread Amandeep Khurana
The implementation in the new package is different from the old one. So, if
you want to use it in the same way as before, you'll have to stick to the
mapred package until you upgrade your code to the new implementation.
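
For what it's worth, in the new mapreduce package the TableInputFormat.SCAN
property is expected to hold a serialized Scan object rather than a plain
string, which is why setting it to something like "date" blows up in
convertStringToScan. A minimal sketch of the new-style setup (the job name,
table name, and MyMapper class are placeholders, not your actual code):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    HBaseConfiguration conf = new HBaseConfiguration();
    Job job = new Job(conf, "MyIndexer");      // placeholder job name
    job.setJarByClass(MyMapper.class);         // MyMapper extends TableMapper

    Scan scan = new Scan();                    // add columns/filters as needed
    // initTableMapperJob serializes the Scan into the job configuration
    // (TableInputFormat.SCAN) and sets TableInputFormat for you.
    TableMapReduceUtil.initTableMapperJob("mytable", scan,
        MyMapper.class, ImmutableBytesWritable.class, Text.class, job);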


On Mon, Aug 3, 2009 at 3:45 PM, Lucas Nazário dos Santos <
nazario.lu...@gmail.com> wrote:

> Thanks, but I didn't get it. Why should I stick with the old mapred package
> if I'm moving everything to Hadoop and HBase 0.20? Everything in the old
> mapred package is deprecated.
>
>
>
> On Mon, Aug 3, 2009 at 7:31 PM, stack  wrote:
>
> > Looks like crossed lines.
> >
> > In hadoop 0.20.0, there is the mapred package and the mapreduce package.
> > The latter has the new lump-sum context to which you go for all things.
> > HBase has similar.  The new mapreduce package that is in 0.20.0 hbase is
> > the
> > old mapred redone to fit the new hadoop APIs.  Below in your stacktrace I
> > see use of the new hbase mapreduce stuff though you would hew to the old
> > interface.  Try using the stuff in mapred package?
> >
> > St.Ack
> >
> >
> > On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos <
> > nazario.lu...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I'm migrating from HBase 0.19 to version 0.20 and facing an error
> > regarding
> > > the TableInputFormat class. Below is how I'm setting up the job and
> also
> > > the error message I'm getting.
> > >
> > > Does anybody have a clue on what may be happening? It used to work on
> > HBase
> > > 0.19.
> > >
> > > Lucas
> > >
> > >
> > > this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> > > this.configuration.set(TableInputFormat.SCAN, "date");
> > > this.configuration.set("index.name", args[1]);
> > > this.configuration.set("hbase.master", args[2]);
> > > this.configuration.set("index.replication.level", args[3]);
> > >
> > > final Job jobConf = new Job(this.configuration);
> > > jobConf.setJarByClass(Indexer.class);
> > > jobConf.setJobName("NInvestNewsIndexer");
> > >
> > > FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
> > >
> > > jobConf.setInputFormatClass(TableInputFormat.class);
> > > jobConf.setOutputFormatClass(NullOutputFormat.class);
> > >
> > > jobConf.setOutputKeyClass(Text.class);
> > > jobConf.setOutputValueClass(Text.class);
> > >
> > > jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> > > jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
> > >
> > >
> > >
> > >
> > > 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> > > java.io.EOFException
> > >at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > >at
> > org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
> > >at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
> > >at
> > >
> > >
> >
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
> > >at
> > >
> > >
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
> > >at
> > > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > >at
> > >
> > >
> >
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >at
> > > org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> > >at
> > >
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > >at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > >at
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > >at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > >at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >at
> > >
> > >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >at
> > >
> > >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >at java.lang.reflect.Method.invoke(Method.java:597)
> > >at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > Exception in thread "main" java.lang.NullPointerException
> > >at
> > >
> > >
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
> > >at
> > > org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> > >at
> > >
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > >at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > >at
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > >at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > >at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >at com.nash.ninvest.i

Re: stargate performance evaluation

2009-08-03 Thread Andrew Purtell
Hi,

Thanks for the testing and performance report!

You said you used the stargate Client package? It is pretty basic, written
mainly as a convenience for writing test cases in the test suite.

Regarding Stargate quality in general, this is an alpha release. It can survive
torture testing with PE, it seems, and it can handle well-formed requests. But
the implementation is untuned. For example, there is no caching (yet), and the
code has not yet been profiled.

I put up an issue for Stargate performance improvement: 
https://issues.apache.org/jira/browse/HBASE-1741

I'm not sure an all-localhost configuration is the best testing scenario. It 
would be interesting to see how the performance differs with the client remote 
from both the regionservers and the Stargate instance. 
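
Also worth remembering when comparing the two clients: every Stargate read is
a full HTTP round trip, roughly like the sketch below (Java, with made-up
table, row, and column names; the URL layout follows the Stargate REST scheme),
so per-request overhead is inherently higher than the pooled HBase RPC path:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // One random read == one HTTP request/response through the servlet stack.
    URL url = new URL("http://localhost:8080/mytable/row0001/info:data");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/x-protobuf");
    InputStream in = conn.getInputStream();  // body is the protobuf-encoded cell
    // ... decode the protobuf, then ...
    in.close();
    conn.disconnect();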

  - Andy






From: Haijun Cao 
To: hbase-user@hadoop.apache.org
Sent: Monday, August 3, 2009 2:04:16 PM
Subject: stargate performance evaluation

I am evaluating the performance of stargate (which, btw, is a
great contrib to hbase, thanks!). The evaluation program is mostly a simple
modification of the existing PerformanceEvaluation program: it just replaces
the java client with the stargate client and gets values as protobuf.

All of the software (hadoop, zookeeper, hbase, jetty) is
installed on one box. The data set is small, so all data is served out
of memory.

For the random read test, with the java client (the existing PE
program) I can get 19K reads/s; with the stargate client I can only get 3-4K/s.
In both cases the PE program runs with 100 threads. Increasing the number of
threads does not seem to help (it even hurts throughput).

I am just wondering if this is expected (I can't figure out
in theory why the throughput drops). Any ideas for possible
optimization/configuration changes to increase the throughput?

Thanks!

Haijun Cao


  

Re: stargate performance evaluation

2009-08-03 Thread Haijun Cao
Andrew,

Thanks for the reply. I am considering using stargate in one of my projects;
the design/implementation is quite elegant. In your opinion, is there any hard
limitation preventing stargate from achieving the same throughput as the hbase
java client, or is it just a matter of fine tuning? I am not sure caching helps
in the case of random reads. I agree that the all-local setup is naive; I will
do a more realistic test and share the observations.

Haijun





From: Andrew Purtell 
To: hbase-user@hadoop.apache.org
Sent: Monday, August 3, 2009 5:25:09 PM
Subject: Re: stargate performance evaluation

Hi,

Thanks for the testing and performance report!

You said you used the stargate Client package? It is pretty basic, written
mainly as a convenience for writing test cases in the test suite.

Regarding Stargate quality in general, this is an alpha release. It can survive
torture testing with PE, it seems, and it can handle well-formed requests. But
the implementation is untuned. For example, there is no caching (yet), and the
code has not yet been profiled.

I put up an issue for Stargate performance improvement: 
https://issues.apache.org/jira/browse/HBASE-1741

I'm not sure an all-localhost configuration is the best testing scenario. It 
would be interesting to see how the performance differs with the client remote 
from both the regionservers and the Stargate instance. 

  - Andy






From: Haijun Cao 
To: hbase-user@hadoop.apache.org
Sent: Monday, August 3, 2009 2:04:16 PM
Subject: stargate performance evaluation

I am evaluating the performance of stargate (which, btw, is a
great contrib to hbase, thanks!). The evaluation program is mostly a simple
modification of the existing PerformanceEvaluation program: it just replaces
the java client with the stargate client and gets values as protobuf.

All of the software (hadoop, zookeeper, hbase, jetty) is
installed on one box. The data set is small, so all data is served out
of memory.

For the random read test, with the java client (the existing PE
program) I can get 19K reads/s; with the stargate client I can only get 3-4K/s.
In both cases the PE program runs with 100 threads. Increasing the number of
threads does not seem to help (it even hurts throughput).

I am just wondering if this is expected (I can't figure out
in theory why the throughput drops). Any ideas for possible
optimization/configuration changes to increase the throughput?

Thanks!

Haijun Cao


  

HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

2009-08-03 Thread Angus He
Hi All,

In HBase 0.20rc, HTable does not explicitly close the connection to
zookeeper in HTable::close.
This could probably be better, and in my opinion it should be, for two reasons:

1. It is not well-behaved, although zookeeper is able to detect the
lost connection after the next network I/O operation.
2. It is easy to get the zookeeper server stuck with exceptions like "Too
many connections from /0:0:0:0:0:0:0:1 - max is 30" when users
write code like:

    for (int i = 0; i < 1024; ++i) {
        HTable table = new HTable("foobar");
        table.close();
    }

In the current implementation, different HTable instances share the
same connection to zookeeper if they have the same HBaseConfiguration
instance. Because of this, we cannot close the connection directly in HTable,
but we could probably implement the HConnection class with
reference-counting ability.
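
A minimal sketch of that reference-counting idea (the class and method names
below are made up for illustration, not an existing API):

    // Hypothetical wrapper: the last HTable to release the shared
    // connection is the one that actually closes the zookeeper session.
    public class RefCountedConnection {
        private final ZooKeeperWrapper zk;  // assumed shared zookeeper handle
        private int refCount = 0;

        public RefCountedConnection(ZooKeeperWrapper zk) {
            this.zk = zk;
        }

        public synchronized void retain() {   // called from HTable's constructor
            ++refCount;
        }

        public synchronized void release() {  // called from HTable.close()
            if (--refCount == 0) {
                zk.close();
            }
        }
    }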

Any comments?

-- 
Regards
Angus


Re: HBase 0.20.0rc does not close the connection to zookeeper explicitly when closing HTable (and HBaseAdmin)

2009-08-03 Thread Ryan Rawson
We should move the clients to a non-active server API, possibly the
REST one, and avoid using active sessions just for clients.  Something
to address in 0.21 I think.

As for #2, it is now recommended to run a quorum of zookeeper servers instead
of a single one. This reduces the risk of running out of connections.

Also, the code snippet you listed is a little degenerate; we can never
fully protect ourselves from fork-bomb-like code.  Your code snippet
suggests that:
- you are creating/closing HTable a lot.  Maybe you shouldn't do that?
  HTablePool could help (see the sketch below).
- you have 1024+ tables and need to access them in one client at one time.

In the meantime, strongly consider upgrading to a cluster of 5-7 ZK
hosts.  For production, you should consider NOT running them on your
HBase/HDFS/map-reduce nodes.
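
If the churn really is just HTable creation, HTablePool makes the loop cheap;
a minimal sketch (assuming the 0.20 HTablePool API; the table name is the one
from your snippet):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.HTablePool;

    HBaseConfiguration conf = new HBaseConfiguration();
    HTablePool pool = new HTablePool(conf, 10);  // keep at most 10 pooled HTables

    for (int i = 0; i < 1024; ++i) {
        HTable table = pool.getTable("foobar");
        try {
            // ... do work against the table ...
        } finally {
            pool.putTable(table);  // return to the pool instead of closing
        }
    }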

Good luck!
-ryan

On Mon, Aug 3, 2009 at 11:00 PM, Angus He wrote:
> Hi All,
>
> In HBase 0.20rc, HTable does not explicitly close the connection to
> zookeeper in HTable::close.
> This could probably be better, and in my opinion it should be, for two reasons:
>
> 1. It is not well-behaved, although zookeeper is able to detect the
> lost connection after the next network I/O operation.
> 2. It is easy to get the zookeeper server stuck with exceptions like "Too
> many connections from /0:0:0:0:0:0:0:1 - max is 30" when users
> write code like:
>
>     for (int i = 0; i < 1024; ++i) {
>         HTable table = new HTable("foobar");
>         table.close();
>     }
>
> In the current implementation, different HTable instances share the
> same connection to zookeeper if they have the same HBaseConfiguration
> instance. Because of this, we cannot close the connection directly in HTable,
> but we could probably implement the HConnection class with
> reference-counting ability.
>
> Any comments?
>
> --
> Regards
> Angus
>


Scanner Performance Question / Observation

2009-08-03 Thread Kyle Oba
Hey folks,

I have a question (possibly naive) about a scan's performance.

My scan is taking about 6 seconds to do the following:

470 rows extracted
total size of all rows together is about 1.4 megs

I'm using an InclusiveStopRow filter to limit the rows being
extracted by the scanner.  The problem for me is that I will soon be
scanning 30x this amount of data, and 6x30 seconds will be quite
a hit.

We're already planning to jump to HBase v0.20 soon.  But I wanted to
see if folks think the 6-second timing on this scan is reasonable,
or horribly off.

Thanks.

Kyle


Re: Scanner Performance Question / Observation

2009-08-03 Thread Ryan Rawson
Have a look at htable.setScannerCaching, and please, please upgrade to 0.20
posthaste. The only answer to 0.19 perf problems is "upgrade to 0.20".
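
A rough sketch of what that looks like (0.19-era client API; the table and
column names are placeholders):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HConstants;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Scanner;
    import org.apache.hadoop.hbase.io.RowResult;
    import org.apache.hadoop.hbase.util.Bytes;

    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");  // placeholder table name
    // Fetch rows in batches per RPC instead of one row per round trip;
    // 100 is just an illustrative value.
    table.setScannerCaching(100);

    Scanner scanner = table.getScanner(
        new byte[][] { Bytes.toBytes("cf:") }, HConstants.EMPTY_START_ROW);
    try {
        for (RowResult row : scanner) {
            // process row
        }
    } finally {
        scanner.close();
    }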

On Aug 3, 2009 12:02 AM, "Kyle Oba"  wrote:

Hey folks,

I have a question (possibly naive) about a scan's performance.

My scan is taking about 6 seconds to do the following:

470 rows extracted
total size of all rows together is about 1.4 megs

I'm using an InclusiveStopRow filter to limit the rows being
extracted by the scanner.  The problem for me is that I will soon be
scanning 30x this amount of data, and 6x30 seconds will be quite
a hit.

We're already planning to jump to HBase v0.20 soon.  But I wanted to
see if folks think the 6-second timing on this scan is reasonable,
or horribly off.

Thanks.

Kyle


RE: Connection failure to HBase

2009-08-03 Thread Onur AKTAS

I have changed hbase-site.xml as below, and it now works (in local mode). It's
something about Hadoop, maybe?

<configuration>
  <property>
    <name>hbase.master</name>
    <value>localhost:60000</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.regionserver</name>
    <value>localhost:60020</value>
  </property>
</configuration>


> Date: Sun, 2 Aug 2009 19:25:16 -0700
> Subject: Re: Connection failure to HBase
> From: vpura...@gmail.com
> To: hbase-user@hadoop.apache.org
> 
> You can set hbase.master property on the configuration object:
> 
> config.set("hbase.master", "localhost:9000");
> 
> Regards,
> Vaibhav
> 
> 2009/8/2 Onur AKTAS 
> 
> >
> > Hi,
> >
> > I have just installed Hadoop 19.3 (pseudo distributed mode) and Hbase 19.2
> > by following the instructions.
> > Both of them starts fine.
> >
> > Hadoop Log:
> > $ bin/start-all.sh
> > starting namenode, logging to
> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
> > localhost: starting datanode, logging to
> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
> > 
> >
> > HBase Log:
> > $ bin/start-hbase.sh
> > starting master, logging to
> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-master-localhost.localdomain.out
> > localhost:
> > starting regionserver, logging to
> >
> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-regionserver-localhost.localdomain.out
> >
> > When I try to connect HBase from a client, it gives an error as:
> >
> > Aug 3, 2009 3:35:04 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
> > handleConnectionFailure
> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
> > 0 time(s).
> > Aug 3, 2009 3:35:05 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
> > handleConnectionFailure
> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
> > 1 time(s).
> >
> > I have configured sites.xml etc. as "localhost:9000". How can I change
> > that 60000 port in the client? I use it like below in my Java class.
> > HBaseConfiguration config = new HBaseConfiguration();
> >
> > Thanks.
> >


Version Problems of hadoop and Hbase

2009-08-03 Thread bharath vissapragada
Hi all,

I have HBase version 0.19.3. On what versions of Hadoop apart from 0.19.x
can I run this HBase version?

Is it 0.20.x or 0.18.x?

Thanks in advance


Re: Version Problems of hadoop and Hbase

2009-08-03 Thread tim robertson
I believe only that Hadoop version.
Cheers,
Tim

According to the docs
  
http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html#overview_description

  Requirements
   - Java 1.6.x, preferably from Sun.
   - Hadoop 0.19.x. This version of HBase will only run on this
version of Hadoop.
  ...





On Mon, Aug 3, 2009 at 1:49 PM, bharath
vissapragada wrote:
> Hi all,
>
> I have HBase version 0.19.3. On what versions of Hadoop apart from 0.19.x
> can I run this HBase version?
>
> Is it 0.20.x or 0.18.x?
>
> Thanks in advance
>


RE: Version Problems of hadoop and Hbase

2009-08-03 Thread Onur AKTAS

Some people are talking about HBase 0.20 (improved performance etc.). Is it
available for download? If yes, where can I download it?

Thanks.

> Date: Mon, 3 Aug 2009 13:54:29 +0200
> Subject: Re: Version Problems of hadoop and Hbase
> From: timrobertson...@gmail.com
> To: hbase-user@hadoop.apache.org
> 
> I believe only that Hadoop version.
> Cheers,
> Tim
> 
> According to the docs
>   
> http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html#overview_description
> 
>   Requirements
>- Java 1.6.x, preferably from Sun.
>- Hadoop 0.19.x. This version of HBase will only run on this
> version of Hadoop.
>   ...
> 
> 
> 
> 
> 
> On Mon, Aug 3, 2009 at 1:49 PM, bharath
> vissapragada wrote:
> > Hi all,
> >
> > I have HBase version 0.19.3. On what versions of Hadoop apart from 0.19.x
> > can I run this HBase version?
> >
> > Is it 0.20.x or 0.18.x?
> >
> > Thanks in advance
> >


Re: Version Problems of hadoop and Hbase

2009-08-03 Thread bharath vissapragada
http://people.apache.org/~stack/hbase-0.20.0-candidate-1/

release candidate-1

2009/8/3 Onur AKTAS 

>
> Some people are talking about HBase 0.20 (improved performance etc.). Is it
> available for download? If yes, where can I download it?
>
> Thanks.
>
> > Date: Mon, 3 Aug 2009 13:54:29 +0200
> > Subject: Re: Version Problems of hadoop and Hbase
> > From: timrobertson...@gmail.com
> > To: hbase-user@hadoop.apache.org
> >
> > I believe only that Hadoop version.
> > Cheers,
> > Tim
> >
> > According to the docs
> >
> http://hadoop.apache.org/hbase/docs/r0.19.3/api/overview-summary.html#overview_description
> >
> >   Requirements
> >- Java 1.6.x, preferably from Sun.
> >- Hadoop 0.19.x. This version of HBase will only run on this
> > version of Hadoop.
> >   ...
> >
> >
> >
> >
> >
> > On Mon, Aug 3, 2009 at 1:49 PM, bharath
> > vissapragada wrote:
> > > Hi all,
> > >
> > > I have HBase version 0.19.3. On what versions of Hadoop apart from 0.19.x
> > > can I run this HBase version?
> > >
> > > Is it 0.20.x or 0.18.x?
> > >
> > > Thanks in advance
> > >
>


Re: Connection failure to HBase

2009-08-03 Thread Jean-Daniel Cryans
If this is all of your hbase-site.xml, you're not using Hadoop at all.
Please review the Pseudo-distributed documentation for HBase.

J-D

2009/8/3 Onur AKTAS :
>
> I have changed hbase-site.xml as below, and it now works (in local mode). It's
> something about Hadoop, maybe?
>
> <configuration>
>   <property>
>     <name>hbase.master</name>
>     <value>localhost:60000</value>
>     <description>The directory shared by region servers.</description>
>   </property>
>   <property>
>     <name>hbase.regionserver</name>
>     <value>localhost:60020</value>
>   </property>
> </configuration>
>
>
>> Date: Sun, 2 Aug 2009 19:25:16 -0700
>> Subject: Re: Connection failure to HBase
>> From: vpura...@gmail.com
>> To: hbase-user@hadoop.apache.org
>>
>> You can set hbase.master property on the configuration object:
>>
>> config.set("hbase.master", "localhost:9000");
>>
>> Regards,
>> Vaibhav
>>
>> 2009/8/2 Onur AKTAS 
>>
>> >
>> > Hi,
>> >
>> > I have just installed Hadoop 19.3 (pseudo distributed mode) and Hbase 19.2
>> > by following the instructions.
>> > Both of them starts fine.
>> >
>> > Hadoop Log:
>> > $ bin/start-all.sh
>> > starting namenode, logging to
>> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
>> > localhost: starting datanode, logging to
>> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
>> > 
>> >
>> > HBase Log:
>> > $ bin/start-hbase.sh
>> > starting master, logging to
>> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-master-localhost.localdomain.out
>> > localhost:
>> > starting regionserver, logging to
>> >
>> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-regionserver-localhost.localdomain.out
>> >
>> > When I try to connect HBase from a client, it gives an error as:
>> >
>> > Aug 3, 2009 3:35:04 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>> > handleConnectionFailure
>> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
>> > 0 time(s).
>> > Aug 3, 2009 3:35:05 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>> > handleConnectionFailure
>> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
>> > 1 time(s).
>> >
>> > I have configured sites.xml etc. as "localhost:9000". How can I change
>> > that 60000 port in the client? I use it like below in my Java class.
>> > HBaseConfiguration config = new HBaseConfiguration();
>> >
>> > Thanks.
>> >
>


RE: Connection failure to HBase

2009-08-03 Thread Onur AKTAS

No, this is what I have after the change.

I was using the config below, but it was not working. It was giving an exception like
"INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried"

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:9000/hbase</value>
  <description>The directory shared by region servers.</description>
</property>




> Date: Mon, 3 Aug 2009 08:36:09 -0400
> Subject: Re: Connection failure to HBase
> From: jdcry...@apache.org
> To: hbase-user@hadoop.apache.org
> 
> If this is all of your hbase-site.xml, you're not using Hadoop at all.
> Please review the Pseudo-distributed documentation for HBase.
> 
> J-D
> 
> 2009/8/3 Onur AKTAS :
> >
> > I have changed hbase-site.xml as below, and it now works (in local mode).
> > It's something about Hadoop, maybe?
> >
> > <configuration>
> >   <property>
> >     <name>hbase.master</name>
> >     <value>localhost:60000</value>
> >     <description>The directory shared by region servers.</description>
> >   </property>
> >   <property>
> >     <name>hbase.regionserver</name>
> >     <value>localhost:60020</value>
> >   </property>
> > </configuration>
> >
> >
> >> Date: Sun, 2 Aug 2009 19:25:16 -0700
> >> Subject: Re: Connection failure to HBase
> >> From: vpura...@gmail.com
> >> To: hbase-user@hadoop.apache.org
> >>
> >> You can set hbase.master property on the configuration object:
> >>
> >> config.set("hbase.master", "localhost:9000");
> >>
> >> Regards,
> >> Vaibhav
> >>
> >> 2009/8/2 Onur AKTAS 
> >>
> >> >
> >> > Hi,
> >> >
> >> > I have just installed Hadoop 19.3 (pseudo distributed mode) and Hbase 
> >> > 19.2
> >> > by following the instructions.
> >> > Both of them starts fine.
> >> >
> >> > Hadoop Log:
> >> > $ bin/start-all.sh
> >> > starting namenode, logging to
> >> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
> >> > localhost: starting datanode, logging to
> >> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
> >> > 
> >> >
> >> > HBase Log:
> >> > $ bin/start-hbase.sh
> >> > starting master, logging to
> >> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-master-localhost.localdomain.out
> >> > localhost:
> >> > starting regionserver, logging to
> >> >
> >> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-regionserver-localhost.localdomain.out
> >> >
> >> > When I try to connect HBase from a client, it gives an error as:
> >> >
> >> > Aug 3, 2009 3:35:04 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
> >> > handleConnectionFailure
> >> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
> >> > 0 time(s).
> >> > Aug 3, 2009 3:35:05 AM org.apache.hadoop.hbase.ipc.HBaseClient$Connection
> >> > handleConnectionFailure
> >> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
> >> > 1 time(s).
> >> >
> >> > I have configured sites.xml etc. as "localhost:9000". How can I change
> >> > that 60000 port in the client? I use it like below in my Java class.
> >> > HBaseConfiguration config = new HBaseConfiguration();
> >> >
> >> > Thanks.
> >> >
> >


Re: Connection failure to HBase

2009-08-03 Thread Jean-Daniel Cryans
If the client is not able to talk to the Master, it means that
something went wrong there that prevented it from starting. Look in the
master's log; you should see an exception.

J-D

2009/8/3 Onur AKTAS :
>
> No, this is what I have after the change.
>
> I was using the config below, but it was not working. It was giving an exception
> like "INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried"
>
> <property>
>   <name>hbase.rootdir</name>
>   <value>hdfs://localhost:9000/hbase</value>
>   <description>The directory shared by region servers.</description>
> </property>
>
>
>> Date: Mon, 3 Aug 2009 08:36:09 -0400
>> Subject: Re: Connection failure to HBase
>> From: jdcry...@apache.org
>> To: hbase-user@hadoop.apache.org
>>
>> If this is all of your hbase-site.xml, you're not using Hadoop at all.
>> Please review the Pseudo-distributed documentation for HBase.
>>
>> J-D
>>
>> 2009/8/3 Onur AKTAS :
>> >
>> > I have changed hbase-site.xml as below, and it now works (in local mode).
>> > It's something about Hadoop, maybe?
>> >
>> > <configuration>
>> >   <property>
>> >     <name>hbase.master</name>
>> >     <value>localhost:60000</value>
>> >     <description>The directory shared by region servers.</description>
>> >   </property>
>> >   <property>
>> >     <name>hbase.regionserver</name>
>> >     <value>localhost:60020</value>
>> >   </property>
>> > </configuration>
>> >
>> >
>> >> Date: Sun, 2 Aug 2009 19:25:16 -0700
>> >> Subject: Re: Connection failure to HBase
>> >> From: vpura...@gmail.com
>> >> To: hbase-user@hadoop.apache.org
>> >>
>> >> You can set hbase.master property on the configuration object:
>> >>
>> >> config.set("hbase.master", "localhost:9000");
>> >>
>> >> Regards,
>> >> Vaibhav
>> >>
>> >> 2009/8/2 Onur AKTAS 
>> >>
>> >> >
>> >> > Hi,
>> >> >
>> >> > I have just installed Hadoop 19.3 (pseudo distributed mode) and Hbase 
>> >> > 19.2
>> >> > by following the instructions.
>> >> > Both of them starts fine.
>> >> >
>> >> > Hadoop Log:
>> >> > $ bin/start-all.sh
>> >> > starting namenode, logging to
>> >> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
>> >> > localhost: starting datanode, logging to
>> >> > /hda3/ps/hadoop-0.19.2/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
>> >> > 
>> >> >
>> >> > HBase Log:
>> >> > $ bin/start-hbase.sh
>> >> > starting master, logging to
>> >> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-master-localhost.localdomain.out
>> >> > localhost:
>> >> > starting regionserver, logging to
>> >> >
>> >> > /hda3/ps/hbase-0.19.3/bin/../logs/hbase-oracle-regionserver-localhost.localdomain.out
>> >> >
>> >> > When I try to connect HBase from a client, it gives an error as:
>> >> >
>> >> > Aug 3, 2009 3:35:04 AM 
>> >> > org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>> >> > handleConnectionFailure
>> >> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
>> >> > 0 time(s).
>> >> > Aug 3, 2009 3:35:05 AM 
>> >> > org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>> >> > handleConnectionFailure
>> >> > INFO: Retrying connect to server: localhost/127.0.0.1:60000. Already tried
>> >> > 1 time(s).
>> >> >
>> >> > I have configured sites.xml etc. as "localhost:9000". How can I change
>> >> > that 60000 port in the client? I use it like below in my Java class.
>> >> > HBaseConfiguration config = new HBaseConfiguration();
>> >> >
>> >> > Thanks.
>> >> >
>> >
>


RE: Connection failure to HBase

2009-08-03 Thread Onur AKTAS

Here is what I do.

Pseudo-Distributed Operation in: 
http://hadoop.apache.org/common/docs/current/quickstart.html 
I edit

conf/core-site.xml,
conf/hdfs-site.xml,
conf/mapred-site.xml:

$ bin/hadoop namenode -format

$ bin/start-all.sh
starting namenode, logging to
/hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
localhost: starting datanode, logging to 
/hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to 
/hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to 
/hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to 
/hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-tasktracker-localhost.localdomain.out



When I check the logs in hadoop-oracle-namenode-localhost.localdomain.log
 I see something like
2009-08-03 21:26:20,757 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 
on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
/tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 0 
nodes, instead of 1
java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
could only be replicated to 0 nodes, instead of 1
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2009-08-03 21:26:21,177 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 
on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
/tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 0 
nodes, instead of 1
java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
could only be replicated to 0 nodes, instead of 1
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2009-08-03 21:26:21,982 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 
on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
/tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 0 
nodes, instead of 1
java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
could only be replicated to 0 nodes, instead of 1
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

Re: Connection failure to HBase

2009-08-03 Thread Jean-Daniel Cryans
I see many problems here.

First, it seems you are trying to use HBase 0.19 with Hadoop 0.20. As
the HBase 0.19 doc says, it only works on Hadoop 0.19.x. Also, in
your first email you told us that you are using Hadoop 0.19.3 (which,
btw, isn't released; 0.19.2 just was), so that's quite confusing.

Also, Hadoop by default writes in /tmp/hadoop-#{username}, so there
must be something wrong in your configuration if it's trying to use it as
the filesystem. The exception you see normally means that the Namenode
wasn't able to assign data to Datanodes at all. Please confirm that
your Hadoop configuration is OK; further Hadoop-related questions
should be directed at their mailing list.

After you sorted out these problems, it will be much easier to run HBase.

Cheers,

J-D

2009/8/3 Onur AKTAS :
>
> Here is what I do.
>
> Pseudo-Distributed Operation in: 
> http://hadoop.apache.org/common/docs/current/quickstart.html
> I edit
>
> conf/core-site.xml,
> conf/hdfs-site.xml,
> conf/mapred-site.xml:
>
> $ bin/hadoop namenode -format
>
> $ bin/start-all.sh
> starting namenode, logging to
> /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
> localhost: starting datanode, logging to 
> /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
> localhost: starting secondarynamenode, logging to 
> /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-secondarynamenode-localhost.localdomain.out
> starting jobtracker, logging to 
> /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-jobtracker-localhost.localdomain.out
> localhost: starting tasktracker, logging to 
> /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-tasktracker-localhost.localdomain.out
>
>
>
> When I check the logs in hadoop-oracle-namenode-localhost.localdomain.log
>  I see something like
> 2009-08-03 21:26:20,757 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 3 on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
> DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
> /tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 
> 0 nodes, instead of 1
> java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
> could only be replicated to 0 nodes, instead of 1
>    at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
>    at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>    at java.security.AccessController.doPrivileged(Native Method)
>    at javax.security.auth.Subject.doAs(Subject.java:396)
>    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> 2009-08-03 21:26:21,177 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 5 on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
> DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
> /tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 
> 0 nodes, instead of 1
> java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
> could only be replicated to 0 nodes, instead of 1
>    at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
>    at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>    at java.security.AccessController.doPrivileged(Native Method)
>    at javax.security.auth.Subject.doAs(Subject.java:396)
>    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> 2009-08-03 21:26:21,982 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 6 on 9000, call addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
> DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: File 
> /tmp/hadoop-oracle/mapred/system/jobtracker.info could only be replicated to 
> 0 nodes, instead of 1
> java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
> could only be repl

RE: Connection failure to HBase

2009-08-03 Thread Onur AKTAS

Sorry, I was trying with Hadoop 0.19.2 and with HBase 0.19.3 (I wrote Hadoop 
0.19.3 and HBase 0.19.2 by mistake).
Anyway, now I try with Hadoop 0.20.0  and HBase 0.20.0. 

Here are my Hadoop configuration files.

core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

Here is my HBase configuration file.

hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
</configuration>

That's all. HBase works fine when I use it standalone, but I could not
make it work on Hadoop in pseudo-distributed mode.
I'm going to ask the Hadoop list too.
Thanks.

> Date: Mon, 3 Aug 2009 14:39:33 -0400
> Subject: Re: Connection failure to HBase
> From: jdcry...@apache.org
> To: hbase-user@hadoop.apache.org
> 
> I see many problems here.
> 
> First, it seems you are trying to use HBase 0.19 with Hadoop 0.20. As
> the HBase 0.19 doc says, it only works on Hadoop 0.19.x. Also, in
> your first email you told us that you are using Hadoop 0.19.3 (which,
> btw, isn't released; 0.19.2 just was), so that's quite confusing.
>
> Also, Hadoop by default writes in /tmp/hadoop-#{username}, so there
> must be something wrong in your configuration if it's trying to use it as
> the filesystem. The exception you see normally means that the Namenode
> wasn't able to assign data to Datanodes at all. Please confirm that
> your Hadoop configuration is OK; further Hadoop-related questions
> should be directed at their mailing list.
> 
> After you sorted out these problems, it will be much easier to run HBase.
> 
> Cheers,
> 
> J-D
> 
> 2009/8/3 Onur AKTAS :
> >
> > Here is what I do.
> >
> > Pseudo-Distributed Operation in: 
> > http://hadoop.apache.org/common/docs/current/quickstart.html
> > I edit
> >
> > conf/core-site.xml,
> > conf/hdfs-site.xml,
> > conf/mapred-site.xml:
> >
> > $ bin/hadoop namenode -format
> >
> > $ bin/start-all.sh
> > starting namenode, logging to
> > /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-namenode-localhost.localdomain.out
> > localhost: starting datanode, logging to 
> > /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-datanode-localhost.localdomain.out
> > localhost: starting secondarynamenode, logging to 
> > /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-secondarynamenode-localhost.localdomain.out
> > starting jobtracker, logging to 
> > /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-jobtracker-localhost.localdomain.out
> > localhost: starting tasktracker, logging to 
> > /hda3/ps/hadoop-0.20.0/bin/../logs/hadoop-oracle-tasktracker-localhost.localdomain.out
> >
> >
> >
> > When I check the logs in hadoop-oracle-namenode-localhost.localdomain.log
> >  I see something like
> > 2009-08-03 21:26:20,757 INFO org.apache.hadoop.ipc.Server: IPC Server 
> > handler 3 on 9000, call 
> > addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
> > DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: 
> > File /tmp/hadoop-oracle/mapred/system/jobtracker.info could only be 
> > replicated to 0 nodes, instead of 1
> > java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
> > could only be replicated to 0 nodes, instead of 1
> >at 
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
> >at 
> > org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >at 
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >at 
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >at java.lang.reflect.Method.invoke(Method.java:597)
> >at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> >at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
> >at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
> >at java.security.AccessController.doPrivileged(Native Method)
> >at javax.security.auth.Subject.doAs(Subject.java:396)
> >at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> > 2009-08-03 21:26:21,177 INFO org.apache.hadoop.ipc.Server: IPC Server 
> > handler 5 on 9000, call 
> > addBlock(/tmp/hadoop-oracle/mapred/system/jobtracker.info, 
> > DFSClient_-1600979110) from 127.0.0.1:22460: error: java.io.IOException: 
> > File /tmp/hadoop-oracle/mapred/system/jobtracker.info could only be 
> > replicated to 0 nodes, instead of 1
> > java.io.IOException: File /tmp/hadoop-oracle/mapred/system/jobtracker.info 
> > could only be replicated to 0 nodes, instead of 1
> >at 
> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
> >at 
> > org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >at

Re: Scanner Performance Question / Observation

2009-08-03 Thread Xinan Wu
6 sec isn't crazy with 0.19. If you really want to research it, have a
look at where the time is spent: creating the scanner or actually doing
the scanning. I think it's the former. That being said, upgrading to
0.20 is a much quicker solution; the scanner has been optimized in the new
version.
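
A crude way to split the two costs apart (a sketch against the 0.19-style
client API; `table` is your already-open HTable and the column spec is a
placeholder):

    long t0 = System.currentTimeMillis();
    Scanner scanner = table.getScanner(
        new byte[][] { Bytes.toBytes("cf:") }, HConstants.EMPTY_START_ROW);
    long t1 = System.currentTimeMillis();     // cost of opening the scanner

    int rows = 0;
    for (RowResult row : scanner) {
        ++rows;                               // cost of actually scanning
    }
    long t2 = System.currentTimeMillis();
    scanner.close();

    System.out.println("open: " + (t1 - t0) + " ms, scan of "
        + rows + " rows: " + (t2 - t1) + " ms");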

On Mon, Aug 3, 2009 at 12:01 AM, Kyle Oba wrote:
> Hey folks,
>
> I have a question (possibly naive) about a scan's performance.
>
> My scan is taking about 6 seconds to do the following:
>
> 470 rows extracted
> total size of all rows together is about 1.4 megs
>
> I'm using an InclusiveStopRow filter to limit the rows being
> extracted by the scanner.  The problem for me is that I will soon be
> scanning 30x this amount of data, and 6x30 seconds will be quite
> a hit.
>
> We're already planning to jump to HBase v0.20 soon.  But I wanted to
> see if folks think the 6-second timing on this scan is reasonable,
> or horribly off.
>
> Thanks.
>
> Kyle
>


stargate performance evaluation

2009-08-03 Thread Haijun Cao
I am evaluating the performance of stargate (which, btw, is a
great contrib to hbase, thanks!). The evaluation program is mostly a simple
modification of the existing PerformanceEvaluation program: it just replaces
the java client with the stargate client and gets values as protobuf.

All of the software (hadoop, zookeeper, hbase, jetty) is
installed on one box. The data set is small, so all data is served out
of memory.

For the random read test, with the java client (the existing PE
program) I can get 19K reads/s; with the stargate client I can only get 3-4K/s.
In both cases the PE program runs with 100 threads. Increasing the number of
threads does not seem to help (it even hurts throughput).

I am just wondering if this is expected (I can't figure out
in theory why the throughput drops). Any ideas for possible
optimization/configuration changes to increase the throughput?
 
Thanks!

Haijun Cao


  

Problem with TableInputFormat - HBase 0.20

2009-08-03 Thread Lucas Nazário dos Santos
Hi,

I'm migrating from HBase 0.19 to version 0.20 and facing an error regarding
the TableInputFormat class. Below is how I'm setting up the job and also
the error message I'm getting.

Does anybody have a clue on what may be happening? It used to work on HBase
0.19.

Lucas


this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
this.configuration.set(TableInputFormat.SCAN, "date");
this.configuration.set("index.name", args[1]);
this.configuration.set("hbase.master", args[2]);
this.configuration.set("index.replication.level", args[3]);

final Job jobConf = new Job(this.configuration);
jobConf.setJarByClass(Indexer.class);
jobConf.setJobName("NInvestNewsIndexer");

FileInputFormat.setInputPaths(jobConf, new Path(args[0]));

jobConf.setInputFormatClass(TableInputFormat.class);
jobConf.setOutputFormatClass(NullOutputFormat.class);

jobConf.setOutputKeyClass(Text.class);
jobConf.setOutputValueClass(Text.class);

jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);




09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
at
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
at
org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Exception in thread "main" java.lang.NullPointerException
at
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
at
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


Re: Problem with TableInputFormat - HBase 0.20

2009-08-03 Thread stack
Looks like crossed lines.

In hadoop 0.20.0, there is the mapred package and the mapreduce package.
The latter has the new lump-sum context to which you go for all things.
HBase has similar.  The new mapreduce package that is in 0.20.0 hbase is the
old mapred redone to fit the new hadoop APIs.  Below in your stacktrace I
see use of the new hbase mapreduce stuff though you would hew to the old
interface.  Try using the stuff in mapred package?
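
For the mapred-package route, the old-style wiring looks roughly like this
(assuming the org.apache.hadoop.hbase.mapred helpers; the table, column, and
MyTableMap names are placeholders):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapred.TableMapReduceUtil;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;

    JobConf jobConf = new JobConf(new HBaseConfiguration(), Indexer.class);
    jobConf.setJobName("NInvestNewsIndexer");
    // Columns are passed as a space-separated string in the old API.
    TableMapReduceUtil.initTableMapJob("mytable", "cf:date",
        MyTableMap.class, ImmutableBytesWritable.class, Text.class, jobConf);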

St.Ack


On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos <
nazario.lu...@gmail.com> wrote:

> Hi,
>
> I'm migrating from HBase 0.19 to version 0.20 and facing an error regarding
> the TableInputFormat class. Below is how I'm setting up the job and also
> the error message I'm getting.
>
> Does anybody have a clue on what may be happening? It used to work on HBase
> 0.19.
>
> Lucas
>
>
> this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> this.configuration.set(TableInputFormat.SCAN, "date");
> this.configuration.set("index.name", args[1]);
> this.configuration.set("hbase.master", args[2]);
> this.configuration.set("index.replication.level", args[3]);
>
> final Job jobConf = new Job(this.configuration);
> jobConf.setJarByClass(Indexer.class);
> jobConf.setJobName("NInvestNewsIndexer");
>
> FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
>
> jobConf.setInputFormatClass(TableInputFormat.class);
> jobConf.setOutputFormatClass(NullOutputFormat.class);
>
> jobConf.setOutputKeyClass(Text.class);
> jobConf.setOutputValueClass(Text.class);
>
> jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
>
>
>
>
> 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> java.io.EOFException
>at java.io.DataInputStream.readFully(DataInputStream.java:180)
>at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
>at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
>at
>
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
>at
>
> org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
>at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>at
>
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>at
> org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
>at
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Exception in thread "main" java.lang.NullPointerException
>at
>
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
>at
> org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>at
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>


Re: Scanner Performance Question / Observation

2009-08-03 Thread stack
On Mon, Aug 3, 2009 at 1:58 PM, Xinan Wu  wrote:

> 6 sec isn't crazy with 0.19. If you really want to research it, have a
> look at where the time is spent: creating the scanner or actually doing
> the scanning. I think it's the former.


You are probably right that it is the former.  In 0.19, scanners would open a
new Reader against every file in the region before scanning could start
(trip to namenode, then out to each dn to read in indices, etc.).  In
0.20.0, the already-open files are used.
St.Ack


Re: Problem with TableInputFormat - HBase 0.20

2009-08-03 Thread Lucas Nazário dos Santos
Thanks, but I didn't get it. Why should I stick with the old mapred package
if I'm moving everything to Hadoop and HBase 0.20? Everything in the old
mapred package is deprecated.



On Mon, Aug 3, 2009 at 7:31 PM, stack  wrote:

> Looks like crossed lines.
>
> In hadoop 0.20.0, there is the mapred package and the mapreduce package.
> The latter has the new lump-sum context to which you go for all things.
> HBase has similar.  The new mapreduce package that is in 0.20.0 hbase is
> the
> old mapred redone to fit the new hadoop APIs.  Below in your stacktrace I
> see use of the new hbase mapreduce stuff though you would hew to the old
> interface.  Try using the stuff in mapred package?
>
> St.Ack
>
>
> On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos <
> nazario.lu...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm migrating from HBase 0.19 to version 0.20 and facing an error
> regarding
> > the TableInputFormat class. Below is how I'm setting up the job and also
> > the error message I'm getting.
> >
> > Does anybody have a clue on what may be happening? It used to work on
> HBase
> > 0.19.
> >
> > Lucas
> >
> >
> > this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> > this.configuration.set(TableInputFormat.SCAN, "date");
> > this.configuration.set("index.name", args[1]);
> > this.configuration.set("hbase.master", args[2]);
> > this.configuration.set("index.replication.level", args[3]);
> >
> > final Job jobConf = new Job(this.configuration);
> > jobConf.setJarByClass(Indexer.class);
> > jobConf.setJobName("NInvestNewsIndexer");
> >
> > FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
> >
> > jobConf.setInputFormatClass(TableInputFormat.class);
> > jobConf.setOutputFormatClass(NullOutputFormat.class);
> >
> > jobConf.setOutputKeyClass(Text.class);
> > jobConf.setOutputValueClass(Text.class);
> >
> > jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> > jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
> >
> >
> >
> >
> > 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> > java.io.EOFException
> >at java.io.DataInputStream.readFully(DataInputStream.java:180)
> >at
> org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
> >at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
> >at
> >
> >
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
> >at
> >
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
> >at
> > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> >at
> >
> >
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >at
> > org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> >at
> > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> >at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> >at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> >at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> >at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >at
> >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >at
> >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >at java.lang.reflect.Method.invoke(Method.java:597)
> >at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > Exception in thread "main" java.lang.NullPointerException
> >at
> >
> >
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
> >at
> > org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> >at
> > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> >at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> >at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> >at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> >at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >at
> >
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >at
> >
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >at java.lang.reflect.Method.invoke(Method.java:597)
> >at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> >
>