Re: Storing extremely large size file

2012-04-17 Thread kim young ill
+1 documentation please

On Wed, Apr 18, 2012 at 7:21 AM, anil gupta  wrote:

> +1 for documentation. It will help a lot of people.
>
> On Tue, Apr 17, 2012 at 10:02 PM, lars hofhansl 
> wrote:
>
> > I disagree. This comes up frequently and some basic guidelines should be
> > documented in the Reference Guide.
> > If it is indeed not difficult then the section in the book will be short.
> >
> >
> >
> > - Original Message -
> > From: Michael Segel 
> > To: "user@hbase.apache.org" 
> > Cc: "user@hbase.apache.org" 
> > Sent: Tuesday, April 17, 2012 3:43 PM
> > Subject: Re: Storing extremely large size file
> >
> > -1. It's a boring topic.
> > And it's one of those things that you either get it right or you end up
> > hiring a voodoo witch doctor to curse the author of the chapter...
> >
> > I agree w Jack, it's not difficult just takes some planning and
> > forethought.
> >
> > Also reading lots of blogs... And some practice...
> >
> >
> > Sent from my iPhone
> >
> > On Apr 17, 2012, at 1:42 PM, "Dave Revell" 
> wrote:
> >
> > > +1 Jack :)
> > >
> > > On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> > >
> > >> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
> > >> wrote:
> > >>> I think this is a popular topic that might deserve a section in The
> > Book.
> > >>>
> > >>> By "this topic" I mean storing big binary chunks.
> > >>>
> > >>
> > >> Get Jack Levin to write it (smile).
> > >>
> > >> And make sure the values are compressed that you send over from the
> > >> client
> > >>
> > >> St.Ack
> > >>
> >
> >
>
>
> --
> Thanks & Regards,
> Anil Gupta
>


Re: Storing extremely large size file

2012-04-17 Thread anil gupta
+1 for documentation. It will help a lot of people.

On Tue, Apr 17, 2012 at 10:02 PM, lars hofhansl  wrote:

> I disagree. This comes up frequently and some basic guidelines should be
> documented in the Reference Guide.
> If it is indeed not difficult then the section in the book will be short.
>
>
>
> - Original Message -
> From: Michael Segel 
> To: "user@hbase.apache.org" 
> Cc: "user@hbase.apache.org" 
> Sent: Tuesday, April 17, 2012 3:43 PM
> Subject: Re: Storing extremely large size file
>
> -1. It's a boring topic.
> And it's one of those things that you either get it right or you end up
> hiring a voodoo witch doctor to curse the author of the chapter...
>
> I agree w Jack, it's not difficult just takes some planning and
> forethought.
>
> Also reading lots of blogs... And some practice...
>
>
> Sent from my iPhone
>
> On Apr 17, 2012, at 1:42 PM, "Dave Revell"  wrote:
>
> > +1 Jack :)
> >
> > On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> >
> >> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
> >> wrote:
> >>> I think this is a popular topic that might deserve a section in The
> Book.
> >>>
> >>> By "this topic" I mean storing big binary chunks.
> >>>
> >>
> >> Get Jack Levin to write it (smile).
> >>
> >> And make sure the values are compressed that you send over from the
> >> client
> >>
> >> St.Ack
> >>
>
>


-- 
Thanks & Regards,
Anil Gupta


Re: Storing extremely large size file

2012-04-17 Thread lars hofhansl
I disagree. This comes up frequently and some basic guidelines should be 
documented in the Reference Guide.
If it is indeed not difficult then the section in the book will be short.



- Original Message -
From: Michael Segel 
To: "user@hbase.apache.org" 
Cc: "user@hbase.apache.org" 
Sent: Tuesday, April 17, 2012 3:43 PM
Subject: Re: Storing extremely large size file

-1. It's a boring topic. 
And it's one of those things that you either get it right or you end up hiring 
a voodoo witch doctor to curse the author of the chapter...

I agree w Jack, it's not difficult just takes some planning and forethought.

Also reading lots of blogs... And some practice...


Sent from my iPhone

On Apr 17, 2012, at 1:42 PM, "Dave Revell"  wrote:

> +1 Jack :)
> 
> On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> 
>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
>> wrote:
>>> I think this is a popular topic that might deserve a section in The Book.
>>> 
>>> By "this topic" I mean storing big binary chunks.
>>> 
>> 
>> Get Jack Levin to write it (smile).
>> 
>> And make sure the values are compressed that you send over from the
>> client
>> 
>> St.Ack
>> 



Re: Storing extremely large size file

2012-04-17 Thread Weishung Chung
Thank you all. Practice makes perfect :)

On Tue, Apr 17, 2012 at 5:46 PM, Michael Segel wrote:

> In theory, you could go as large as a region size minus the key and
> overhead. (rows can't span regions)
>
> Realistically you'd want to go much smaller.
>
>
> Sent from my iPhone
>
> On Apr 17, 2012, at 1:49 PM, "Wei Shung Chung"  wrote:
>
> > What would be the max affordable size one could have ?
> >
> > Sent from my iPhone
> >
> > On Apr 17, 2012, at 1:42 PM, Dave Revell  wrote:
> >
> >> +1 Jack :)
> >>
> >> On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> >>
> >>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
> >>> wrote:
>  I think this is a popular topic that might deserve a section in The
> Book.
> 
>  By "this topic" I mean storing big binary chunks.
> 
> >>>
> >>> Get Jack Levin to write it (smile).
> >>>
> >>> And make sure the values are compressed that you send over from the
> >>> client
> >>>
> >>> St.Ack
> >>>
>


Re: Storing extremely large size file

2012-04-17 Thread Michael Segel
In theory, you could go as large as a region size minus the key and overhead. 
(rows can't span regions)

Realistically you'd want to go much smaller.


Sent from my iPhone

On Apr 17, 2012, at 1:49 PM, "Wei Shung Chung"  wrote:

> What would be the max affordable size one could have ?
> 
> Sent from my iPhone
> 
> On Apr 17, 2012, at 1:42 PM, Dave Revell  wrote:
> 
>> +1 Jack :)
>> 
>> On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
>> 
>>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
>>> wrote:
 I think this is a popular topic that might deserve a section in The Book.
 
 By "this topic" I mean storing big binary chunks.
 
>>> 
>>> Get Jack Levin to write it (smile).
>>> 
>>> And make sure the values are compressed that you send over from the
>>> client
>>> 
>>> St.Ack
>>> 


Re: Storing extremely large size file

2012-04-17 Thread Michael Segel
-1. It's a boring topic. 
And it's one of those things that you either get it right or you end up hiring 
a voodoo witch doctor to curse the author of the chapter...

I agree w Jack, it's not difficult just takes some planning and forethought.

Also reading lots of blogs... And some practice...


Sent from my iPhone

On Apr 17, 2012, at 1:42 PM, "Dave Revell"  wrote:

> +1 Jack :)
> 
> On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> 
>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
>> wrote:
>>> I think this is a popular topic that might deserve a section in The Book.
>>> 
>>> By "this topic" I mean storing big binary chunks.
>>> 
>> 
>> Get Jack Levin to write it (smile).
>> 
>> And make sure the values are compressed that you send over from the
>> client
>> 
>> St.Ack
>> 


Re: regions stuck in transition

2012-04-17 Thread Alex Baranau
I've seen similar behavior on our cluster too.

Off the top of my head, you can try to restart the particular RegionServer
that those regions belong to (in the cases I saw, a single regionserver was
usually the issue).

Have you tried to access data from that region (e.g. in the shell)? I think it
should still be served.

Alex Baranau
--
Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase
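
For reference, a minimal Java-client equivalent of the shell check suggested above, assuming the 0.90-era HTable API; the table name is taken from the logs below and the row key is a placeholder that would need to fall inside the stuck region's key range:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionReadCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Table name taken from the logs below; the row key is a placeholder.
    HTable table = new HTable(conf, "visitor-activities-a2");
    Get get = new Get(Bytes.toBytes("some-row-in-the-stuck-region"));
    // If the region really is offline, this get will keep retrying and eventually fail.
    Result result = table.get(get);
    System.out.println(result.isEmpty()
        ? "No data returned (the row may simply not exist)"
        : "Region is still serving reads");
    table.close();
  }
}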

On Mon, Apr 16, 2012 at 11:21 AM, Bryan Beaudreault <
bbeaudrea...@hubspot.com> wrote:

> Hello,
>
> We've recently had a problem where regions will get stuck in transition for
> a long period of time.  In fact, they don't ever appear to get
> out-of-transition unless we take manual action.  Last time this happened I
> restarted the master and they were cleared out.  This time I wanted to
> consult the list first.
>
> I checked the admin ui for all 24 of our servers, and the region does not
> appear to be hosted anywhere.  If I look in hdfs, I do see the region there
> and it has 2 files.  The first instance of this region in my HMaster logs
> is:
>
> 2/04/15 17:48:06 INFO master.HMaster: balance
> >
> hri=visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.,
> > src=X.ec2.internal,60020,1334064456919,
> > dest=.ec2.internal,60020,1334064197946
> > 12/04/15 17:48:06 INFO master.AssignmentManager: Server
> > serverName=.ec2.internal,60020,1334064456919, load=(requests=0,
> > regions=0, usedHeap=0, maxHeap=0) returned
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: Received close for
> >
> visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.
> > but we are not serving it for 703fed4411f2d6ff4b3ea80506fb635e
>
>
> It then keeps saying the same few logs every ~30 mins:
>
> 12/04/15 18:18:18 INFO master.AssignmentManager: Regions in transition
> > timed out:
> >
>  
> visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.
> > state=PENDING_CLOSE, ts=1334526491544, server=null
> > 12/04/15 18:18:18 INFO master.AssignmentManager: Region has been
> > PENDING_CLOSE for too long, running forced unassign again on
> >
> region=visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.
> > 12/04/15 18:18:18 INFO master.AssignmentManager: Server
> > serverName=X.ec2.internal,60020,1334064456919, load=(requests=0,
> > regions=0, usedHeap=0, maxHeap=0) returned
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: Received close for
> >
> visitor-activities-a2,\x00\x02EG120909,1333750824238.703fed4411f2d6ff4b3ea80506fb635e.
> > but we are not serving it for 703fed4411f2d6ff4b3ea80506fb635e
>
>
> Any ideas how I can avoid this, or a better solution than restarting the
> HMaster?
>
> Thanks,
>
> Bryan
>


Re: Storing extremely large size file

2012-04-17 Thread Wei Shung Chung
What would be the max affordable size one could have ?

Sent from my iPhone

On Apr 17, 2012, at 1:42 PM, Dave Revell  wrote:

> +1 Jack :)
> 
> On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:
> 
>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
>> wrote:
>>> I think this is a popular topic that might deserve a section in The Book.
>>> 
>>> By "this topic" I mean storing big binary chunks.
>>> 
>> 
>> Get Jack Levin to write it (smile).
>> 
>> And make sure the values are compressed that you send over from the
>> client
>> 
>> St.Ack
>> 


Re: Storing extremely large size file

2012-04-17 Thread Dave Revell
+1 Jack :)

On Tue, Apr 17, 2012 at 11:38 AM, Stack  wrote:

> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell 
> wrote:
> > I think this is a popular topic that might deserve a section in The Book.
> >
> > By "this topic" I mean storing big binary chunks.
> >
>
> Get Jack Levin to write it (smile).
>
> And make sure the values are compressed that you send over from the
> client
>
> St.Ack
>


Re: Storing extremely large size file

2012-04-17 Thread Stack
On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell  wrote:
> I think this is a popular topic that might deserve a section in The Book.
>
> By "this topic" I mean storing big binary chunks.
>

Get Jack Levin to write it (smile).

And make sure the values are compressed that you send over from the client

St.Ack
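
As a sketch of the client-side compression Stack mentions, values can be gzipped with plain java.util.zip before the Put; the table, row, and column names here are placeholders:

import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPOutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class CompressedPutExample {
  public static void main(String[] args) throws Exception {
    byte[] blob = loadBlobSomehow();  // the large binary chunk to store

    // Gzip the value on the client so less data crosses the wire and sits in the KeyValue.
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    gzip.write(blob);
    gzip.close();
    byte[] compressed = bos.toByteArray();

    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "images");  // placeholder table name
    Put put = new Put(Bytes.toBytes("image-row-1"));
    put.add(Bytes.toBytes("content"), Bytes.toBytes("gz"), compressed);
    table.put(put);
    table.close();
    // Readers must gunzip the value after a Get.
  }

  private static byte[] loadBlobSomehow() {
    return new byte[0];  // stand-in for reading the real file
  }
}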


Re: manually splitting Hlogs

2012-04-17 Thread Chris Tarnas
Hi Stack,

Thanks, I have all the regions picked up now.

This particular cluster is on CDH3b4 (long story) but it is slated to be 
upgraded next week.

Here is a clip from the master log on the timeout:

2012-04-14 09:44:24,110 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
for block blk_-4506327502711501968_8582436 failed  because recovery from 
primary datanode 192.168.1.12:50010 failed 1 times.  Pipeline was 
192.168.1.24:50010, 192.168.1.12:50010, 192.168.1.22:50010. Will retry...
2012-04-14 09:45:25,118 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery 
attempt #1 from primary datanode 192.168.1.12:50010
java.net.SocketTimeoutException: Call to /192.168.1.12:50020 failed on socket 
timeout exception: java.net.SocketTimeoutException: 6 millis timeout while 
waiting for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/192.168.1.31:35014 
remote=/192.168.1.12:50020]


The corresponding data node, 192.168.1.12,  had this:

2012-04-14 09:44:24,171 WARN 
org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to 
getBlockMetaDataInfo for block (=blk_-4506327502711501968_8582436) from 
datanode (=192.168.1.22:50010)
java.net.SocketTimeoutException: Call to /192.168.1.22:50020 failed on socket 
timeout exception: java.net.SocketTimeoutException: 6 millis timeout while 
waiting for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/192.168.1.12:36626 
remote=/192.168.1.22:50020]

192.168.1.22 was the node that died. I can put more logs in paste bin if needed.

thanks,
-chris


On Apr 14, 2012, at 3:23 PM, Stack wrote:

> On Sat, Apr 14, 2012 at 9:49 AM, Chris Tarnas  wrote:
>> I looked into org.apache.hadoop.hbase.regionserver.wal.HLog --split  and 
>> didn't see any notes about not running on a live cluster so I ran it and it 
>> ran fine.  Was it safe to run with hbase up? Were the newly created files 
>> correctly added to the existing regions?
>> 
> 
> Should be fine w/ hbase up -- thats how the split is usually done.
> 
> The newly created files were not added to the regions is my guess
> since we only check for their presence on region open.
> 
> Can you see what new files were made and where?  Reassign those
> regions and that should pick up the edits made by your split.
> 
> What version of hbase Chris?
> 
> St.Ack



Re: Storing extremely large size file

2012-04-17 Thread Dave Revell
I think this is a popular topic that might deserve a section in The Book.

By "this topic" I mean storing big binary chunks.

-Dave

On Tue, Apr 17, 2012 at 11:07 AM, kim young ill wrote:

> i plan to move some data from relational-db to hbase, most of them are
> binary with some hundreds KBs, would like to hear some
> best-practice/tuning  for storing this kind of data
> thanx
>
> On Tue, Apr 17, 2012 at 7:50 PM, Jean-Daniel Cryans  >wrote:
>
> > Yes, you fine tuned it properly :)  But in general I wouldn't
> > recommend it to new users.
> >
> > J-D
> >
> > On Tue, Apr 17, 2012 at 10:47 AM, Jack Levin  wrote:
> > > Whats wrong with that size?  We store > 15MB routinely into our image
> > hbase.
> > >
> > > -Jack
> > >
> > > On Tue, Apr 17, 2012 at 10:46 AM, Jean-Daniel Cryans
> > >  wrote:
> > >> Make sure the config is changed client-side not server-side.
> > >>
> > >> Also you might not want to store 12MB values in HBase.
> > >>
> > >> J-D
> > >>
> > >> On Tue, Apr 17, 2012 at 6:06 AM, vishnupriyaa 
> > wrote:
> > >>>
> > >>> I want to save a file of size 12MB but an exception occuring like
> this
> > >>> KeyValue size too large.
> > >>> I have set the value of hbase.client.keyvalue.maxsize in
> > hbase-site.xml and
> > >>> hbase-default.xml to 3GB
> > >>> but the default value 10MB is taking for
> > hbase.client.keyvalue.maxsize.How
> > >>> could I change the value of hbase.client.keyvalue.maxsize or how to
> > store
> > >>> the file of extremely large size.
> > >>> --
> > >>> View this message in context:
> >
> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
> > >>> Sent from the HBase User mailing list archive at Nabble.com.
> > >>>
> >
>


Re: Storing extremely large size file

2012-04-17 Thread kim young ill
I plan to move some data from a relational db to hbase; most of it is
binary, in the range of a few hundred KBs. I would like to hear some
best-practice/tuning advice for storing this kind of data.
Thanks

On Tue, Apr 17, 2012 at 7:50 PM, Jean-Daniel Cryans wrote:

> Yes, you fine tuned it properly :)  But in general I wouldn't
> recommend it to new users.
>
> J-D
>
> On Tue, Apr 17, 2012 at 10:47 AM, Jack Levin  wrote:
> > Whats wrong with that size?  We store > 15MB routinely into our image
> hbase.
> >
> > -Jack
> >
> > On Tue, Apr 17, 2012 at 10:46 AM, Jean-Daniel Cryans
> >  wrote:
> >> Make sure the config is changed client-side not server-side.
> >>
> >> Also you might not want to store 12MB values in HBase.
> >>
> >> J-D
> >>
> >> On Tue, Apr 17, 2012 at 6:06 AM, vishnupriyaa 
> wrote:
> >>>
> >>> I want to save a file of size 12MB but an exception occuring like this
> >>> KeyValue size too large.
> >>> I have set the value of hbase.client.keyvalue.maxsize in
> hbase-site.xml and
> >>> hbase-default.xml to 3GB
> >>> but the default value 10MB is taking for
> hbase.client.keyvalue.maxsize.How
> >>> could I change the value of hbase.client.keyvalue.maxsize or how to
> store
> >>> the file of extremely large size.
> >>> --
> >>> View this message in context:
> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
> >>> Sent from the HBase User mailing list archive at Nabble.com.
> >>>
>


Re: Storing extremely large size file

2012-04-17 Thread Jean-Daniel Cryans
Yes, you fine tuned it properly :)  But in general I wouldn't
recommend it to new users.

J-D

On Tue, Apr 17, 2012 at 10:47 AM, Jack Levin  wrote:
> Whats wrong with that size?  We store > 15MB routinely into our image hbase.
>
> -Jack
>
> On Tue, Apr 17, 2012 at 10:46 AM, Jean-Daniel Cryans
>  wrote:
>> Make sure the config is changed client-side not server-side.
>>
>> Also you might not want to store 12MB values in HBase.
>>
>> J-D
>>
>> On Tue, Apr 17, 2012 at 6:06 AM, vishnupriyaa  wrote:
>>>
>>> I want to save a file of size 12MB but an exception occuring like this
>>> KeyValue size too large.
>>> I have set the value of hbase.client.keyvalue.maxsize in hbase-site.xml and
>>> hbase-default.xml to 3GB
>>> but the default value 10MB is taking for hbase.client.keyvalue.maxsize.How
>>> could I change the value of hbase.client.keyvalue.maxsize or how to store
>>> the file of extremely large size.
>>> --
>>> View this message in context: 
>>> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>


Re: Storing extremely large size file

2012-04-17 Thread Jack Levin
What's wrong with that size? We routinely store >15MB values in our image hbase.

-Jack

On Tue, Apr 17, 2012 at 10:46 AM, Jean-Daniel Cryans
 wrote:
> Make sure the config is changed client-side not server-side.
>
> Also you might not want to store 12MB values in HBase.
>
> J-D
>
> On Tue, Apr 17, 2012 at 6:06 AM, vishnupriyaa  wrote:
>>
>> I want to save a file of size 12MB but an exception occuring like this
>> KeyValue size too large.
>> I have set the value of hbase.client.keyvalue.maxsize in hbase-site.xml and
>> hbase-default.xml to 3GB
>> but the default value 10MB is taking for hbase.client.keyvalue.maxsize.How
>> could I change the value of hbase.client.keyvalue.maxsize or how to store
>> the file of extremely large size.
>> --
>> View this message in context: 
>> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>


Re: Storing extremely large size file

2012-04-17 Thread Jean-Daniel Cryans
Make sure the config is changed client-side not server-side.

Also you might not want to store 12MB values in HBase.

J-D
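
A minimal sketch of what "client-side" means in practice: the hbase.client.keyvalue.maxsize override has to be visible to the Configuration the client builds its HTable from, either programmatically as below or via the hbase-site.xml on the client's classpath; the table name and limit are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class ClientSideMaxSize {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Allow KeyValues up to ~64MB from this client.
    conf.setInt("hbase.client.keyvalue.maxsize", 64 * 1024 * 1024);
    HTable table = new HTable(conf, "mytable");  // placeholder table name
    // ... puts of large values go through this table instance ...
    table.close();
  }
}

Editing only the server-side files does not change this behaviour, since the size check is enforced on the client.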

On Tue, Apr 17, 2012 at 6:06 AM, vishnupriyaa  wrote:
>
> I want to save a file of size 12MB but an exception occuring like this
> KeyValue size too large.
> I have set the value of hbase.client.keyvalue.maxsize in hbase-site.xml and
> hbase-default.xml to 3GB
> but the default value 10MB is taking for hbase.client.keyvalue.maxsize.How
> could I change the value of hbase.client.keyvalue.maxsize or how to store
> the file of extremely large size.
> --
> View this message in context: 
> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
> Sent from the HBase User mailing list archive at Nabble.com.
>


Need help on using hbase on EC2

2012-04-17 Thread Xin Liu
Hi there,

I set up hadoop and hbase on top of EC2 in pseudo-distributed mode. I
can use the hbase shell to connect. However, when I use the java client to
connect, I get the following error at the client:

12/04/17 10:21:06 INFO zookeeper.RecoverableZooKeeper: The identifier
of this process is 9078@localhost.localdomain
12/04/17 10:21:06 INFO client.ZooKeeperSaslClient: Client will not
SASL-authenticate because the default JAAS configuration section
'Client' could not be found. If you are not using SASL, you may ignore
this. On the other hand, if you expected SASL to work, please fix your
JAAS configuration.
12/04/17 10:21:06 INFO zookeeper.ClientCnxn: Socket connection
established to domU-12-31-39-12-FA-0A.compute-1.internal/23.22.15.27:2181,
initiating session
12/04/17 10:21:06 INFO zookeeper.ClientCnxn: Session establishment
complete on server
domU-12-31-39-12-FA-0A.compute-1.internal/23.22.15.27:2181, sessionid
= 0x136c1355db70006, negotiated timeout = 4
12/04/17 10:21:07 INFO
client.HConnectionManager$HConnectionImplementation: getMaster attempt
0 of 10 failed; retrying after sleep of 1000
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)

In hbase log, I see:

2012-04-17 17:21:01,172 INFO
org.apache.zookeeper.server.ZooKeeperServer: Client attempting to
establish new session at /171.69.154.55:60201
2012-04-17 17:21:01,175 INFO
org.apache.zookeeper.server.ZooKeeperServer: Established session
0x136c1355db70006 with negotiated timeout 4 for client
/171.69.154.55:60201
2012-04-17 17:21:04,090 WARN
org.apache.zookeeper.server.NIOServerCnxn: caught end of stream
exception
EndOfStreamException: Unable to read additional data from client
sessionid 0x136c1355db70006, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:224)
at java.lang.Thread.run(Thread.java:722)
2012-04-17 17:21:04,094 INFO
org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection
for client /171.69.154.55:60201 which had sessionid 0x136c1355db70006

I have opened the EC2 security group to all TCP traffic. I used the EC2
private DNS for the zk quorum. At the client, the private DNS is also used, and an
IP-to-private-DNS mapping is added to my /etc/hosts.

Can someone please help? I think I'm missing something small but
couldn't figure it out...

Thanks,
Xin


Re: hbase coprocessor unit testing

2012-04-17 Thread Alex Baranau
I don't think that your error is related to the CP stuff. What lib versions do
you use? Can you compare them with those of the HBaseHUT pom?

Re 127.0.1.1 vs 127.0.0.1 - what did your hosts file look like before and
now? I think it's just an issue with resolving the IP: in one place it
resolves to localhost, in the other to your hostname. Since (I suppose) those
two didn't match, you got the error.

Alex Baranau
--
Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase

On Tue, Apr 17, 2012 at 9:34 AM, Marcin Cylke  wrote:

> On 17/04/12 15:15, Alex Baranau wrote:
>
> Hi
>
> > Some sanity checks:
> > 1) make sure you don't have 127.0.1.1 in your /etc/hosts (only 127.0.0.1)
>
> I've removed this entry and it worked right away :) Could You explain
> why it did so big difference?
>
> Now the test from HBaseHUT works fine, but mine code is still failing:
>
> #v+
> 2012-04-17 15:26:27,870 [localhost:)] WARN  ClientCnxn
>  :1063 - Session 0x0 for server null, unexpected error, closing
> socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>at
>
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
>at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1041)
> 2012-04-17 15:26:27,871 [dler 2 on 35003] INFO  RecoverableZooKeeper
>  :89 - The identifier of this process is 2032@correspondence
> 2012-04-17 15:26:27,973 [dler 2 on 35003] WARN  RecoverableZooKeeper
>  :159 - Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
> 2012-04-17 15:26:27,974 [dler 2 on 35003] INFO  RetryCounter
>  :53 - The 1 times to retry  after sleeping 2000 ms
> 2012-04-17 15:26:28,973 [localhost:)] INFO  ClientCnxn
>  :933 - Opening socket connection to server localhost/127.0.0.1:
> #v-
>
> My whole test is something like this:
>
> #v+
>  testingUtility.getConfiguration().setStrings(
>   CoprocessorHost.USER_REGION_COPROCESSOR_CONF_KEY,
>   AuxDataCalculator.class.getName());
>  testingUtility.startMiniCluster();
>
> byte[] TABLE = Bytes.toBytes(getClass().getName());
> byte[] A = Bytes.toBytes("A");
> byte[] STATS = Bytes.toBytes("stats");
> byte[] CONTENT = Bytes.toBytes("content");
> byte[][] FAMILIES = new byte[][] { A, STATS, CONTENT } ;
>
> HTable hTable = testingUtility.createTable(TABLE, FAMILIES);
> Put put = new Put(ROW);
> put.add(A, A, A);
>
> hTable.put(put);
>
> Get get = new Get(ROW);
> Result result = hTable.get(get);
> #v-
>
>
> As I don't see any particular differences between Your unit test and
> mine, could You look into this a bit more?
>
> Regards
> Marcin
>


Re: [ hbase ] Re: hbase coprocessor unit testing

2012-04-17 Thread Marcin Cylke
On 16/04/12 16:49, Alex Baranau wrote:
> Here's some code that worked for me [1]. You may also find useful to look
> at the pom's dependencies [2].

Thanks, Your cluster initialization is certainly more elegant than what
I had. However it still gives me the same error as I reported. Moreover,
I've cloned the repository You linked to (branch CP) and tried running
tests for that, and am also getting the same error.

Do those test pass for you?

Regards
Marcin



Re: Connect to HBase out of spring-osgi environment

2012-04-17 Thread David Pocivalnik
Thanks for the reply. There is no Hbase support yet.

I managed to create an OSGi bundle for hbase and the corresponding hadoop jar 
with maven; I just had to add "resolution:=optional" to the import-packages.

Hope that helps others too.

cheers



Re: Storing extremely large size file

2012-04-17 Thread Brock Noland
Hi,

Any reason you cannot have an exception marker for large files and
then store them directly in HDFS?

Brock
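
A rough sketch of the pattern Brock suggests: values above some threshold go straight into HDFS and HBase keeps only a pointer to them. The table name, column names, threshold, and path layout are all assumptions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class LargeValueStore {
  private static final int THRESHOLD = 5 * 1024 * 1024;  // 5MB, illustrative

  public static void store(String rowKey, byte[] value) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "blobs");  // placeholder table name
    Put put = new Put(Bytes.toBytes(rowKey));
    if (value.length <= THRESHOLD) {
      // Small enough: store the value inline in HBase.
      put.add(Bytes.toBytes("data"), Bytes.toBytes("inline"), value);
    } else {
      // Too big: write the value to HDFS and keep only its path in HBase.
      FileSystem fs = FileSystem.get(conf);
      Path path = new Path("/blobs/" + rowKey);  // placeholder layout
      FSDataOutputStream out = fs.create(path);
      out.write(value);
      out.close();
      put.add(Bytes.toBytes("data"), Bytes.toBytes("hdfsPath"),
          Bytes.toBytes(path.toString()));
    }
    table.put(put);
    table.close();
  }
}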

On Tue, Apr 17, 2012 at 1:06 PM, vishnupriyaa  wrote:
>
> I want to save a file of size 12MB but an exception occuring like this
> KeyValue size too large.
> I have set the value of hbase.client.keyvalue.maxsize in hbase-site.xml and
> hbase-default.xml to 3GB
> but the default value 10MB is taking for hbase.client.keyvalue.maxsize.How
> could I change the value of hbase.client.keyvalue.maxsize or how to store
> the file of extremely large size.
> --
> View this message in context: 
> http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
> Sent from the HBase User mailing list archive at Nabble.com.
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/


Re: hbase coprocessor unit testing

2012-04-17 Thread Marcin Cylke
On 17/04/12 15:15, Alex Baranau wrote:

Hi

> Some sanity checks:
> 1) make sure you don't have 127.0.1.1 in your /etc/hosts (only 127.0.0.1)

I've removed this entry and it worked right away :) Could You explain
why it made such a big difference?

Now the test from HBaseHUT works fine, but my code is still failing:

#v+
2012-04-17 15:26:27,870 [localhost:)] WARN  ClientCnxn
  :1063 - Session 0x0 for server null, unexpected error, closing
socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1041)
2012-04-17 15:26:27,871 [dler 2 on 35003] INFO  RecoverableZooKeeper
  :89 - The identifier of this process is 2032@correspondence
2012-04-17 15:26:27,973 [dler 2 on 35003] WARN  RecoverableZooKeeper
  :159 - Possibly transient ZooKeeper exception:
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/master
2012-04-17 15:26:27,974 [dler 2 on 35003] INFO  RetryCounter
  :53 - The 1 times to retry  after sleeping 2000 ms
2012-04-17 15:26:28,973 [localhost:)] INFO  ClientCnxn
  :933 - Opening socket connection to server localhost/127.0.0.1:
#v-

My whole test is something like this:

#v+
 testingUtility.getConfiguration().setStrings(
   CoprocessorHost.USER_REGION_COPROCESSOR_CONF_KEY,
   AuxDataCalculator.class.getName());
 testingUtility.startMiniCluster();

byte[] TABLE = Bytes.toBytes(getClass().getName());
byte[] A = Bytes.toBytes("A");
byte[] STATS = Bytes.toBytes("stats");
byte[] CONTENT = Bytes.toBytes("content");
byte[][] FAMILIES = new byte[][] { A, STATS, CONTENT } ;

HTable hTable = testingUtility.createTable(TABLE, FAMILIES);
Put put = new Put(ROW);
put.add(A, A, A);

hTable.put(put);

Get get = new Get(ROW);
Result result = hTable.get(get);
#v-


As I don't see any particular differences between Your unit test and
mine, could You look into this a bit more?

Regards
Marcin
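
For reference, a self-contained sketch of the kind of mini-cluster coprocessor test discussed in this thread, assuming HBase 0.92's HBaseTestingUtility and JUnit 4; the no-op observer below stands in for the real AuxDataCalculator:

import static org.junit.Assert.assertArrayEquals;

import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.CoprocessorHost;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class CoprocessorRoundTripTest {

  // Stand-in for the real coprocessor under test (e.g. AuxDataCalculator).
  public static class NoOpObserver extends BaseRegionObserver {}

  private static final HBaseTestingUtility UTIL = new HBaseTestingUtility();
  private static final byte[] A = Bytes.toBytes("A");
  private static final byte[] ROW = Bytes.toBytes("row1");

  @BeforeClass
  public static void setUpCluster() throws Exception {
    // Register the observer before the mini cluster starts.
    UTIL.getConfiguration().setStrings(
        CoprocessorHost.USER_REGION_COPROCESSOR_CONF_KEY,
        NoOpObserver.class.getName());
    UTIL.startMiniCluster();
  }

  @AfterClass
  public static void shutDownCluster() throws Exception {
    UTIL.shutdownMiniCluster();
  }

  @Test
  public void putThenGet() throws Exception {
    HTable table = UTIL.createTable(Bytes.toBytes("CoprocessorRoundTripTest"), A);
    Put put = new Put(ROW);
    put.add(A, A, A);
    table.put(put);

    byte[] stored = table.get(new Get(ROW)).getValue(A, A);
    assertArrayEquals(A, stored);
    table.close();
  }
}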


Re: [ hbase ] Re: hbase coprocessor unit testing

2012-04-17 Thread Alex Baranau
Just tried to do a clean clone, then

$ mvn -Dtest=TestHBaseHutCps test

went well [1].

How long does it take for test to fail when you run it?

Some sanity checks:
1) make sure you don't have 127.0.1.1 in your /etc/hosts (only 127.0.0.1)
2) make sure there are no hbase/hadoop processes running on your machine
(sudo jps)
3) cleanup your /tmp dir

I see "java.net.ConnectException: Connection refused", which may indicate
some of your cluster parts failed to start. Bigger log should be more
helpful.

Alex Baranau
--
Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase

[1]
got this for sure:

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[...]
[INFO]

[INFO] BUILD SUCCESS


On Tue, Apr 17, 2012 at 5:01 AM, Marcin Cylke  wrote:

> On 16/04/12 16:49, Alex Baranau wrote:
> > Here's some code that worked for me [1]. You may also find useful to look
> > at the pom's dependencies [2].
>
> Thanks, Your cluster initialization is certainly more elegant than what
> I had. However it still gives me the same error as I reported. Moreover,
> I've cloned the repository You linked to (branch CP) and tried running
> tests for that, and am also getting the same error.
>
> Do those test pass for you?
>
> Regards
> Marcin
>
>


Storing extremely large size file

2012-04-17 Thread vishnupriyaa

I want to save a file of size 12MB, but an exception occurs like this:
KeyValue size too large.
I have set the value of hbase.client.keyvalue.maxsize in hbase-site.xml and
hbase-default.xml to 3GB,
but the default value of 10MB is still being used for hbase.client.keyvalue.maxsize. How
could I change the value of hbase.client.keyvalue.maxsize, or how else can I store
a file of extremely large size?
-- 
View this message in context: 
http://old.nabble.com/Storing-extremely-large-size-file-tp33701522p33701522.html
Sent from the HBase User mailing list archive at Nabble.com.



Re: HBase Security Configuration

2012-04-17 Thread Harsh J
Hey Konrad,

Make sure your HBase's classpath also has the Hadoop conf dir on it
(specifically hdfs-site.xml and core-site.xml). It it already does
have that, make sure they are populated with the right HDFS cluster
values (core-site needs two properties that toggle security ON, and
hdfs-site needs the HDFS server principals configured inside it -
basically just copy these core-site and hdfs-site files from your
secured HDFS cluster config over to the HBase machines/classpath).
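
A quick sanity check, sketched in Java, that the secure Hadoop config is actually visible on the HBase/client classpath; the two property names are standard Hadoop security keys, and the expected values are assumptions about this particular cluster:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SecurityConfCheck {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Should print "kerberos" if the secure core-site.xml is on the classpath
    // (the default "simple" is what leads to "Authentication is required").
    System.out.println("hadoop.security.authentication = "
        + conf.get("hadoop.security.authentication"));
    // Should print the NameNode principal if hdfs-site.xml was copied over.
    System.out.println("dfs.namenode.kerberos.principal = "
        + conf.get("dfs.namenode.kerberos.principal"));
  }
}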

On Tue, Apr 17, 2012 at 5:38 PM, Konrad Tendera  wrote:
> Hello,
> I'm trying to configure secure HBase using following instruction: 
> https://ccp.cloudera.com/display/CDHDOC/HBase+Security+Configuration. Our 
> cluster uses Kerberos and everything in Hadoop work fine. But when I start 
> HBase following exception is thrown
>
> FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting 
> shutdown.
> org.apache.hadoop.security.AccessControlException: Authentication is required
>        at org.apache.hadoop.ipc.Client.call(Client.java:1028)
>        at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
>        at $Proxy9.getProtocolVersion(Unknown Source)
>        at 
> org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235)
>        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275)
>        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:249)
>        at 
> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:161)
>        at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:278)
>        at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:245)
>        at 
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:109)
>        at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1792)
>        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:76)
>        at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1826)
>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1808)
>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:265)
>        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:189)
>        at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:471)
>        at 
> org.apache.hadoop.hbase.master.MasterFileSystem.(MasterFileSystem.java:94)
>        at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448)
>        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
>        at java.lang.Thread.run(Thread.java:662)
>
> I can't find any info about it. I'm using Hbase 0.92 with Hadoop 0.22
>
> --
> Konrad Tendera



-- 
Harsh J


Re: Hbase Map/reduce-How to access individual columns of the table?

2012-04-17 Thread Doug Meil

Hi there-

Have you seen the chapter on MR in the RefGuide?

http://hbase.apache.org/book.html#mapreduce.example

You use the Result instance just like you would from a client program.
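
As an illustration, a hedged sketch of such a mapper; the family names come from the question below, while the visitorId qualifier and the assumption that the friend list is stored as one cell per friend are guesses about the schema:

import java.io.IOException;
import java.util.Map;
import java.util.NavigableMap;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;

public class FriendCheckMapper extends TableMapper<ImmutableBytesWritable, IntWritable> {

  @Override
  public void map(ImmutableBytesWritable row, Result values, Context context)
      throws IOException, InterruptedException {
    // Single cell: family "visitor", qualifier "id" (the qualifier name is a guess).
    byte[] visitorId = values.getValue(Bytes.toBytes("visitor"), Bytes.toBytes("id"));
    if (visitorId == null) {
      return;
    }
    // Walk every cell in the "friend" family and compare each value to visitorId.
    NavigableMap<byte[], byte[]> friends = values.getFamilyMap(Bytes.toBytes("friend"));
    if (friends == null) {
      return;
    }
    for (Map.Entry<byte[], byte[]> friend : friends.entrySet()) {
      if (Bytes.equals(visitorId, friend.getValue())) {
        context.write(row, new IntWritable(1));
        return;
      }
    }
  }
}

With IntWritable values emitted like this, the value class passed to TableMapReduceUtil.initTableMapperJob would need to be IntWritable rather than Text.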




On 4/17/12 6:07 AM, "Ram"  wrote:

> have a table called User with two columns ,one called visitorId and the
>other 
>called friend which is a list of strings. I want to check whether the
>VisitorId 
>is in the friendlist. Can anyone direct me as to how to access the table
>columns 
>in a map function? I'm not able to picture how data is output from a map
>function in hbase. My code is as follows.
>
>ublic class MapReduce {
>
>static class Mapper1 extends TableMapper {
>
>private int numRecords = 0;
>private static final IntWritable one = new IntWritable(1);
>
>
>private final IntWritable ONE = new IntWritable(1);
>private Text text = new Text();
>@Override
>public void map(ImmutableBytesWritable row, Result values, Context
>context) 
>throws IOException {
>
>//What should i do here??how do i access the individual columns
>and   
>compare?
>ImmutableBytesWritable userKey = new
>ImmutableBytesWritable(row.get(),
>0, Bytes.SIZEOF_INT);
>
>   context.write(userkey,One);
> }
>
>//context.write(text, ONE);
>} catch (InterruptedException e) {
>throw new IOException(e);
>}
>
>}
>}
>
>
>
>public static void main(String[] args) throws Exception {
>Configuration conf = HBaseConfiguration.create();
>Job job = new Job(conf, "CheckVisitor");
>job.setJarByClass(MapReduce.class);
>Scan scan = new Scan();
>Filter f = new RowFilter(CompareOp.EQUAL,new
>SubstringComparator("mId2"));
>scan.setFilter(f);
>scan.addFamily(Bytes.toBytes("visitor"));
>scan.addFamily(Bytes.toBytes("friend"));
>TableMapReduceUtil.initTableMapperJob("User", scan, Mapper1.class,
>ImmutableBytesWritable.class,Text.class, job);
>
>}
>
>




HBase Security Configuration

2012-04-17 Thread Konrad Tendera
Hello,
I'm trying to configure secure HBase using the following instructions: 
https://ccp.cloudera.com/display/CDHDOC/HBase+Security+Configuration. Our 
cluster uses Kerberos and everything in Hadoop works fine. But when I start 
HBase, the following exception is thrown:

FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting 
shutdown.
org.apache.hadoop.security.AccessControlException: Authentication is required
at org.apache.hadoop.ipc.Client.call(Client.java:1028)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy9.getProtocolVersion(Unknown Source)
at 
org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:249)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:161)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:278)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:109)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1792)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:76)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1826)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1808)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:265)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:189)
at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:471)
at 
org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:94)
at 
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
at java.lang.Thread.run(Thread.java:662)

I can't find any info about it. I'm using Hbase 0.92 with Hadoop 0.22

-- 
Konrad Tendera


hfile writer append key value offset

2012-04-17 Thread Juan Pino
Hi,

I am using the code below to append key/value pairs from a SequenceFile to
an HFile.
The problem is that I don't include offset and length information.
In the old api (0.20.2), there was a method
append(byte[] key, int koffset, int klength, byte[] value, int voffset, int vlength),
so I would replace line 7. below with hfileWriter.append(key.getBytes(), 0,
key.getLength(), valueBytes, 0, valueBytes.length);
How do I do this with the newer api (0.92)? Thank you very much.

Best regards,

Juan

1. HFile.WriterFactory hfileWriterFactory = HFile.getWriterFactory(conf);
2. HFile.Writer hfileWriter = hfileWriterFactory.createWriter(fs, path, 64
* 1024, "gz", null);
3. BytesWritable key = new BytesWritable();
4. ArrayWritable value = new ArrayWritable(IntWritable.class);
5. while (sequenceReader.next(key, value)) {
6. byte[] valueBytes = Util.object2ByteArray(value);
7. hfileWriter.append(key.getBytes(), valueBytes);
8. }
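
If only the two-argument append is available, one workaround (a sketch, not necessarily the intended 0.92 API) is to copy exactly the valid range of the BytesWritable before appending, since getBytes() can return a padded backing array:

import java.util.Arrays;

// Replacement for the loop above (lines 5.-8.), trading an extra copy for correctness:
while (sequenceReader.next(key, value)) {
  byte[] valueBytes = Util.object2ByteArray(value);
  // key.getBytes() may be longer than key.getLength(), so trim it to the valid range first.
  byte[] keyBytes = Arrays.copyOfRange(key.getBytes(), 0, key.getLength());
  hfileWriter.append(keyBytes, valueBytes);
}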


Hbase Map/reduce-How to access individual columns of the table?

2012-04-17 Thread Ram
I have a table called User with two columns, one called visitorId and the other 
called friend, which is a list of strings. I want to check whether the visitorId 
is in the friend list. Can anyone direct me as to how to access the table columns 
in a map function? I'm not able to picture how data is output from a map 
function in hbase. My code is as follows.

public class MapReduce {

static class Mapper1 extends TableMapper {

private int numRecords = 0;
private static final IntWritable one = new IntWritable(1);


private final IntWritable ONE = new IntWritable(1);
private Text text = new Text();
@Override
public void map(ImmutableBytesWritable row, Result values, Context context) 
throws IOException {

//What should i do here??how do i access the individual columns and compare?
ImmutableBytesWritable userKey = new ImmutableBytesWritable(row.get(), 
0, Bytes.SIZEOF_INT);

   context.write(userkey,One);  
 }

//context.write(text, ONE);
} catch (InterruptedException e) {
throw new IOException(e);
}

}
}



public static void main(String[] args) throws Exception {
Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "CheckVisitor");
job.setJarByClass(MapReduce.class);
Scan scan = new Scan();
Filter f = new RowFilter(CompareOp.EQUAL,new SubstringComparator("mId2"));
scan.setFilter(f);
scan.addFamily(Bytes.toBytes("visitor"));
scan.addFamily(Bytes.toBytes("friend"));
TableMapReduceUtil.initTableMapperJob("User", scan, Mapper1.class, 
ImmutableBytesWritable.class,Text.class, job);

}



Re: Duplicate an HBase cluster

2012-04-17 Thread Manuel de Ferran
On Tue, Apr 17, 2012 at 11:03 AM, Harsh J  wrote:

> Manuel,
>
> You can also just start the second HDFS cluster in parallel, and do an
> "hadoop fs -cp hdfs://original-nn/hbase hdfs://new-nn/hbase" (or a
> distcp) and then start HBase services on the new cluster (make sure zk
> quorum is separate or has a different hbase-rootdir though).
>
>
Thanks for the replies.

The goal is to minimize the offline time, and the copy time would be longer
than the time HDFS replication needs to catch up on a couple of new
blocks.

Our proposal is similar to an incremental backup.


Re: Duplicate an HBase cluster

2012-04-17 Thread Harsh J
Manuel,

You can also just start the second HDFS cluster in parallel, and do an
"hadoop fs -cp hdfs://original-nn/hbase hdfs://new-nn/hbase" (or a
distcp) and then start HBase services on the new cluster (make sure zk
quorum is separate or has a different hbase-rootdir though).

On Tue, Apr 17, 2012 at 2:22 PM, Manuel de Ferran
 wrote:
> Greetings,
>
> we have a 4 nodes cluster running HBase-0.90.3 over Hadoop-0.20-append.
> We'd like to create another HBase cluster from this one with minimal HBase
> downtime. We have plenty of disk space on each datanode.
>
> Here is what we have in mind:
> - Add a new datanode aka. DN5
> - Raise HDFS replication factor to 5 to have a whole copy on each datanode
> - Wait until replication done
>  - Disable all tables
> - Stop DN5
> - Copy Namenode data (dfs/name/current ...) to DN5
> - Enable all tables
> - Start a new namenode on DN5 aka NN2
> - Reconfigure DN5 to point to NN2
> - Configure a new HBase cluster on top of the new HDFS
>
> It works on a small cluster but is it enough to have a consistent copy ?
>
> Any hints ? Is there a best-practice ?
>
> Thanks



-- 
Harsh J


Re: [ hbase ] Re: hbase coprocessor unit testing

2012-04-17 Thread Marcin Cylke
On 16/04/12 16:49, Alex Baranau wrote:
> Here's some code that worked for me [1]. You may also find useful to look
> at the pom's dependencies [2].

Thanks, Your cluster initialization is certainly more elegant than what
I had. However it still gives me the same error as I reported. Moreover,
I've cloned the repository You linked to (branch CP) and tried running
tests for that, and am also getting the same error.

Do those test pass for you?

Regards
Marcin



RE: Duplicate an HBase cluster

2012-04-17 Thread Srikanth P. Shreenivas
Did you consider using HBase Backup/Restore?  We had in the past moved data 
from one cluster to another using this technique.

http://hbase.apache.org/book/ops.backup.html


regards,
Srikanth

-Original Message-
From: Manuel de Ferran [mailto:manuel.defer...@gmail.com]
Sent: Tuesday, April 17, 2012 2:23 PM
To: user@hbase.apache.org
Subject: Duplicate an HBase cluster

Greetings,

we have a 4 nodes cluster running HBase-0.90.3 over Hadoop-0.20-append.
We'd like to create another HBase cluster from this one with minimal HBase 
downtime. We have plenty of disk space on each datanode.

Here is what we have in mind:
- Add a new datanode aka. DN5
- Raise HDFS replication factor to 5 to have a whole copy on each datanode
- Wait until replication done
 - Disable all tables
- Stop DN5
- Copy Namenode data (dfs/name/current ...) to DN5
- Enable all tables
- Start a new namenode on DN5 aka NN2
- Reconfigure DN5 to point to NN2
- Configure a new HBase cluster on top of the new HDFS

It works on a small cluster but is it enough to have a consistent copy ?

Any hints ? Is there a best-practice ?

Thanks


