Re: Question about rest interface

2010-09-30 Thread Andrei Savu
The latest version of the REST gateway, now available in trunk, works
the way you want.

I had the same problem while working on the code. There is
also a simple start/stop script available (src/contrib/rest/rest.sh).

You should check out the trunk [1] or [2]. Run 'ant jar' in the root
folder and 'ant tar' in src/contrib/rest. After running these steps
you will find in build/contrib/rest/ a .tar.gz archive that contains
everything you need to run a standalone REST gateway for ZooKeeper.

The config file should be pretty much self-explanatory, but if you need
more help let me know.

The version in the trunk is now session aware, so you can even use it
to implement things like leader election (you can find some Python
examples in src/contrib/rest/src/python).
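
For anyone who wants a quick feel for the HTTP calls involved, here is a
minimal, untested sketch in Python. The port, the /znodes/v1 path and the
parameter names are my recollection of the gateway's API, so treat them as
assumptions and check them against the docs and the Python examples shipped
in src/contrib/rest/src/python:

import urllib, urllib2

# Base URL: the default port and path are assumptions; see rest.sh and the config file.
GATEWAY = 'http://localhost:9998'

# Read the root znode; the response is JSON describing the node.
print urllib2.urlopen(GATEWAY + '/znodes/v1/').read()

# Create a child of /election. With ephemeral + sequence semantics this is the
# usual leader-election building block; the parameter names below are assumptions,
# and session handling is omitted entirely (see the shipped Python examples).
params = urllib.urlencode({'op': 'create', 'name': 'candidate',
                           'ephemeral': 'true', 'sequence': 'true'})
req = urllib2.Request(GATEWAY + '/znodes/v1/election?' + params, data='my-data')
print urllib2.urlopen(req).read()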

I'm planning to add more features, such as ACLs and session
authentication, but unfortunately I haven't had the time yet. I should
be able to get to this in the near future.

[1] http://hadoop.apache.org/zookeeper/version_control.html
[2] http://github.com/apache/zookeeper

On Thu, Sep 30, 2010 at 7:01 PM, Patrick Hunt ph...@apache.org wrote:
 Hi Marc, you should check out the REST interface that's on the svn trunk; it
 includes new functionality and numerous fixes that might be interesting to
 you. This will be part of 3.4.0. CCing Andrei, who worked on this as part of
 his GSOC project this summer.
 If you look at this file:
 src/contrib/rest/src/java/org/apache/zookeeper/server/jersey/RestMain.java
 you'll see how we start the server. Looks like we need an option to run as a
 process w/o assuming interactive use. It should be pretty easy for someone
 to patch this (if you do, please consider submitting a patch via our JIRA
 process; others would find it interesting). With the current code you might
 get away with something like 'java ... < /dev/null' -- basically turn off
 stdin.
 Patrick
 On Wed, Sep 29, 2010 at 3:09 PM, marc slayton gangofn...@yahoo.com wrote:

 Hey all --

 Having a great time with Zookeeper and recently started testing
 the RESTful interface in src/contrib.

 'ant runrestserver' creates a test instance attached to stdin, which
 works well, but any input kills it. How does one configure Jersey to
 run for real, i.e. not attached to my terminal's stdin?

 I've tried altering log4j settings without much luck.

 If there are example setup docs for Linux, could somebody point
 me there? FWIW, I'm running zookeeper-3.3.1 with openjdk-1.6.

 Cheers, and thanks in advance --







-- 
Andrei Savu -- http://www.andreisavu.ro/


Re: ZK monitoring

2010-08-17 Thread Andrei Savu
It's not possible from a single node; you need to query all the servers
in order to find out which one is the current leader.

It should be pretty simple to implement this by parsing the output
from the 'stat' 4-letter command.
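
For example, a rough, untested sketch in Python (hostnames below are just
placeholders) could look like this:

import socket

def four_letter_word(host, port, cmd='stat', timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw response."""
    sock = socket.create_connection((host, port), timeout)
    try:
        sock.sendall(cmd)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
        return ''.join(chunks)
    finally:
        sock.close()

def find_leader(servers):
    """Return the (host, port) whose 'stat' output contains 'Mode: leader'."""
    for host, port in servers:
        try:
            output = four_letter_word(host, port)
        except socket.error:
            continue  # server down or unreachable, try the next one
        for line in output.splitlines():
            if line.strip().lower() == 'mode: leader':
                return host, port
    return None

print find_leader([('zk1', 2181), ('zk2', 2181), ('zk3', 2181)])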

On Tue, Aug 17, 2010 at 9:50 PM, Jun Rao jun...@gmail.com wrote:
 Hi,

 Is there a way to see the current leader and a list of followers from a
 single node in the ZK quorum? It seems that ZK monitoring (JMX, 4-letter
 commands) only provides info local to a node.

 Thanks,

 Jun




-- Andrei Savu


Re: ZK monitoring

2010-08-17 Thread Andrei Savu
You should also take a look at ZOOKEEPER-744 [1] and ZOOKEEPER-799 [2]

The archive from 799 contains ready-to-use scripts for monitoring
ZooKeeper using Ganglia, Nagios and Cacti.
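
Those scripts are the real thing; just to illustrate the idea behind a basic
Nagios-style liveness check, here is a minimal, untested Python sketch that
sends the 'ruok' four-letter command and expects 'imok' back (the host name
and exit codes are only illustrative):

import socket
import sys

def ruok(host, port=2181, timeout=5.0):
    """Return True if the server answers 'imok' to the 'ruok' command."""
    try:
        sock = socket.create_connection((host, port), timeout)
        try:
            sock.sendall('ruok')
            return sock.recv(4) == 'imok'
        finally:
            sock.close()
    except socket.error:
        return False

if __name__ == '__main__':
    host = sys.argv[1] if len(sys.argv) > 1 else 'localhost'
    if ruok(host):
        print 'OK - %s answered imok' % host
        sys.exit(0)
    print 'CRITICAL - %s did not answer imok' % host
    sys.exit(2)   # Nagios convention: 2 means critical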

Let me know if you need more help.

[1] https://issues.apache.org/jira/browse/ZOOKEEPER-744
[2] https://issues.apache.org/jira/browse/ZOOKEEPER-799

On Tue, Aug 17, 2010 at 9:50 PM, Jun Rao jun...@gmail.com wrote:
 Hi,

 Is there a way to see the current leader and a list of followers from a
 single node in the ZK quorum? It seems that ZK monitoring (JMX, 4-letter
 commands) only provides info local to a node.

 Thanks,

 Jun




-- Andrei Savu


Re: building client tools

2010-07-13 Thread Andrei Savu
Hi,

In this case I think you have to install libcppunit (installing it via
apt-get should work). I believe that should be enough, but I don't
really remember what else I installed the first time I compiled the C client.

Let me know what else was needed. I would like to submit a patch to
update the README file in order to avoid this problem in the future.

Thanks.

On Tue, Jul 13, 2010 at 8:09 PM, Martin Waite waite@gmail.com wrote:
 Hi,

 I am trying to build the c client on debian lenny for zookeeper 3.3.1.

 autoreconf -if
 configure.ac:33: warning: macro `AM_PATH_CPPUNIT' not found in library
 configure.ac:33: warning: macro `AM_PATH_CPPUNIT' not found in library
 configure.ac:33: error: possibly undefined macro: AM_PATH_CPPUNIT
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
 autoreconf: /usr/bin/autoconf failed with exit status: 1

 I probably need to install some required tools. Is there a list of what
 tools are needed to build this, please?

 regards,
 Martin




-- 
Andrei Savu - http://andreisavu.ro/


Re: Starting zookeeper in replicated mode

2010-06-21 Thread Andrei Savu
As Luka Stojanovic suggested, you need to add a file called
/var/zookeeper/myid on each node, containing that node's unique id
(1 through 6 for your six servers):

$ echo N > /var/zookeeper/myid    # where N is the node's id, e.g. 1 on the first server

I want to make a few more comments related to your setup and to your questions:

- There is no configured master node in a ZooKeeper cluster; the
leader is automatically elected at runtime.
- You can write to and read from any node at any time (see the sketch
below, after these answers).

 Am I supposed to have an instance of ZooKeeper on each node started before 
 running in replication mode?
- You start the cluster by starting the nodes one at a time.

 Should I have each node that will be running ZK listed in the config file?
- Yes, you need to have all the nodes running ZK listed in the config file.

 Should I be using an IP address to point to a server instead of a hostname?
- It doesn't really make a difference whether you use hostnames or IP addresses.
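
To make the first two points concrete, here is a small, untested sketch using
the Python bindings from contrib/zkpython (the hostnames are the ones from your
config; the sleep is a crude stand-in for proper connection-watcher handling):

import time
import zookeeper

# The connect string lists every server in the ensemble; the client picks one
# and fails over automatically, so reads and writes can go to any node.
handle = zookeeper.init('master1:2181,slave2:2181,slave3:2181')
time.sleep(2)  # crude: give the session time to establish

acl = [{'perms': zookeeper.PERM_ALL, 'scheme': 'world', 'id': 'anyone'}]
zookeeper.create(handle, '/demo', 'hello from any node', acl, 0)  # fails if /demo exists
print zookeeper.get(handle, '/demo')[0]   # get() returns (data, stat)
zookeeper.close(handle)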

I hope this will help you.

Andrei

On Mon, Jun 21, 2010 at 10:04 PM, Erik Test erik.shi...@gmail.com wrote:
 Hi All,

 I'm having a problem with installing zookeeper on a cluster with 6 nodes in
 replicated mode. I was able to install and run zookeeper in standalone mode
 but I'm unable to run zookeeper in replicated mode.

 I've added a list of servers in zoo.cfg as suggested by the ZooKeeper
 Getting Started Guide but I get these logs displayed to screen:

 *[r...@master1 bin]# ./zkServer.sh start
 JMX enabled by default
 Using config: /root/zookeeper-3.2.2/bin/../conf/zoo.cfg
 Starting zookeeper ...
 STARTED
 [r...@master1 bin]# 2010-06-21 12:25:23,738 - INFO
 [main:quorumpeercon...@80] - Reading configuration from:
 /root/zookeeper-3.2.2/bin/../conf/zoo.cfg
 2010-06-21 12:25:23,743 - INFO  [main:quorumpeercon...@232] - Defaulting to
 majority quorums
 2010-06-21 12:25:23,745 - FATAL [main:quorumpeerm...@82] - Invalid config,
 exiting abnormally
 org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error
 processing /root/zookeeper-3.2.2/bin/../conf/zoo.cfg
        at
 org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:100)
        at
 org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:98)
        at
 org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:75)
 Caused by: java.lang.IllegalArgumentException: /var/zookeeper/myid file is
 missing
        at
 org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:238)
        at
 org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:96)
        ... 2 more
 Invalid config, exiting abnormally*

 And here is my config file:
 *
 # The number of milliseconds of each tick
 tickTime=2000
 # The number of ticks that the initial
 # synchronization phase can take
 initLimit=5
 # The number of ticks that can pass between
 # sending a request and getting an acknowledgement
 syncLimit=2
 # the directory where the snapshot is stored.
 dataDir=/var/zookeeper
 # the port at which the clients will connect
 clientPort=2181
 server.1=master1:2888:3888
 server.2=slave2:2888:3888
 server.3=slave3:2888:3888
 *
 I'm a little confused as to why this doesn't work and I haven't had any luck
 finding answers to some questions I have.

 Am I supposed to have an instance of ZooKeeper on each node started before
 running in replication mode? Should I have each node that will be running ZK
 listed in the config file? Should I be using an IP address to point to a
 server instead of a hostname?

 Thanks for your time.
 Erik




-- 
Andrei Savu

http://www.andreisavu.ro/


GSoC 2010: ZooKeeper Monitoring Recipes and Web-based Administrative Interface

2010-05-13 Thread Andrei Savu
Hi all,

My name is Andrei Savu and I am one of the GSoC 2010 accepted students.
My mentor is Patrick Hunt.

My objective in the next 4 months is to write tools and recipes for
monitoring ZooKeeper and to implement a web-based administrative
interface.

I have created a wiki page for this project:
 - http://wiki.apache.org/hadoop/ZooKeeper/GSoCMonitoringAndWebInterface

Are there any HBase- or Hadoop-specific ZooKeeper monitoring requirements?

Regards

-- 
Savu Andrei

Website: http://www.andreisavu.ro/


Sample Application: Feed Aggregator

2009-09-25 Thread Andrei Savu
Hi,

I have just finished the first version of a small Python/Thrift demo
application: a basic feed aggregator. I want to share this with you
because I believe it could be useful for a beginner (I have written
detailed install instructions). Someone new to HBase should be able to
understand how to build an index table.

You can find the source code on github:
http://github.com/andreisavu/feedaggregator

Thank you for your attention. I would highly appreciate your feedback.

-- 
Savu Andrei

Website: http://www.andreisavu.ro/


unable to start hbase 0.20. zookeeper server not found.

2009-08-28 Thread Andrei Savu
Hi,

I have downloaded the release candidate from here: http://su.pr/1NHIlM
and I am unable to make it start standalone. It seems like the
zookeeper server does not start.

2009-08-28 10:43:49,872 INFO org.apache.zookeeper.ZooKeeper:
Initiating client connection, host=localhost:2181 sessionTimeout=6
watcher=Thread[Thread-0,5,main]
2009-08-28 10:43:49,876 INFO org.apache.zookeeper.ClientCnxn:
zookeeper.disableAutoWatchReset is false
2009-08-28 10:43:49,911 INFO org.apache.zookeeper.ClientCnxn:
Attempting connection to server localhost/127.0.0.1:2181
2009-08-28 10:43:49,926 WARN org.apache.zookeeper.ClientCnxn:
Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@7d2452e8
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:885)
2009-08-28 10:43:49,933 WARN org.apache.zookeeper.ClientCnxn: Ignoring
exception during shutdown input

Should the zookeeper server be installed as a standalone application?

I'm running bin/start-hbase.sh. On the same machine hbase 0.19.3 works fine.

Sorry if this is a silly question :)

-- 
Savu Andrei

Website: http://www.andreisavu.ro/


Re: unable to start hbase 0.20. zookeeper server not found.

2009-08-28 Thread Andrei Savu
While trying to write a response I found the solution :)

It seems like the OS environment is not what I expected it to be when
running a command over SSH.

This tutorial helped me understand why JAVA_HOME is not set and how to fix it.
http://www.netexpertise.eu/en/ssh/environment-variables-and-ssh.html

Thanks for your time.

On Fri, Aug 28, 2009 at 5:06 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
 What's in the Zookeeper log? It's kept with the other HBase logs.

 J-D

 On Fri, Aug 28, 2009 at 3:59 AM, Andrei Savu savu.and...@gmail.com wrote:
 Hi,

 I have downloaded the release candidate from here: http://su.pr/1NHIlM
 and I am unable to make it start standalone. It seems like the
 zookeeper server does not start.

 2009-08-28 10:43:49,872 INFO org.apache.zookeeper.ZooKeeper:
 Initiating client connection, host=localhost:2181 sessionTimeout=6
 watcher=Thread[Thread-0,5,main]
 2009-08-28 10:43:49,876 INFO org.apache.zookeeper.ClientCnxn:
 zookeeper.disableAutoWatchReset is false
 2009-08-28 10:43:49,911 INFO org.apache.zookeeper.ClientCnxn:
 Attempting connection to server localhost/127.0.0.1:2181
 2009-08-28 10:43:49,926 WARN org.apache.zookeeper.ClientCnxn:
 Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@7d2452e8
 java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:885)
 2009-08-28 10:43:49,933 WARN org.apache.zookeeper.ClientCnxn: Ignoring
 exception during shutdown input

 Should the zookeeper server be installed as a standalone application?

 I'm running bin/start-hbase.sh. On the same machine hbase 0.19.3 works
 fine.

 Sorry if this is a silly question :)

 --
 Savu Andrei

 Website: http://www.andreisavu.ro/





-- 
Savu Andrei

Website: http://www.andreisavu.ro/


Re: hbase/jython outdated

2009-08-27 Thread Andrei Savu
See comments below.

On Thu, Aug 27, 2009 at 7:58 PM, stack st...@duboce.net wrote:
 On Wed, Aug 26, 2009 at 3:29 AM, Andrei Savu savu.and...@gmail.com wrote:

 I have fixed the code samples and opened a feature request on JIRA for
 the jython command.

 https://issues.apache.org/jira/browse/HBASE-1796


 Thanks. Patch looks good. Will commit soon. Did you update the jython
 wiki page? It seems to still be using the old API.

I have updated the Jython wiki page to use the latest API. After the
commit I will also update the instructions for running the sample code.



 Is there any Python library for the REST interface? How stable is the REST
 interface?


 Not that I know of (a ruby one, yes IIRC).  Write against stargate if you
 are going to do one since o.a.h.h.rest is deprecated in 0.20.0.


I am going to give it a try and post the results back here.

What about Thrift? Is it going to be deprecated?

 St.Ack


-- 
Savu Andrei

Website: http://www.andreisavu.ro/


Re: hbase/jython outdated

2009-08-26 Thread Andrei Savu
I have fixed the code samples and opened a feature request on JIRA for
the jython command.

https://issues.apache.org/jira/browse/HBASE-1796

Until recently I used the Python Thrift interface, but it has some
serious issues with Unicode. Currently I am searching for alternatives.

Is there any Python library for the REST interface? How stable is the
REST interface?

On Tue, Aug 25, 2009 at 4:18 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
 I can edit this page just fine but you have to be logged in to do
 that, anyone can sign in.

 Thx!

 J-D

 On Tue, Aug 25, 2009 at 7:02 AM, Andrei Savu savu.and...@gmail.com wrote:
 Hi,

 The Hbase/Jython ( http://wiki.apache.org/hadoop/Hbase/Jython ) wiki
 page is outdated.
 I want to edit it but the page is marked as immutable.

 I have attached a working sample and a patched version of bin/hbase
 with the jython command added.

 --
 Savu Andrei

 Website: http://www.andreisavu.ro/





-- 
Savu Andrei

Website: http://www.andreisavu.ro/


hbase/jython outdated

2009-08-25 Thread Andrei Savu
Hi,

The Hbase/Jython ( http://wiki.apache.org/hadoop/Hbase/Jython ) wiki
page is outdated.
I want to edit it but the page is marked as immutable.

I have attached a working sample and a patched version of bin/hbase
with the jython command added.

-- 
Savu Andrei

Website: http://www.andreisavu.ro/

import java.lang
from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, HConstants
from org.apache.hadoop.hbase.client import HBaseAdmin, HTable
from org.apache.hadoop.hbase.io import BatchUpdate, Cell, RowResult

# First get a conf object.  This will read in the configuration 
# that is out in your hbase-*.xml files such as location of the
# hbase master node.
conf = HBaseConfiguration()

# Create a table named 'test' that has two column families,
# one named 'content', and the other 'anchor'.  The colons
# are required for column family names.
tablename = "test"

desc = HTableDescriptor(tablename)
desc.addFamily(HColumnDescriptor("content:"))
desc.addFamily(HColumnDescriptor("anchor:"))
admin = HBaseAdmin(conf)

# Drop and recreate the table if it already exists
if admin.tableExists(tablename):
    admin.disableTable(tablename)
    admin.deleteTable(tablename)
admin.createTable(desc)

tables = admin.listTables()
table = HTable(conf, tablename)

# Add content to 'content:' on a row named 'row_x'
row = 'row_x'
update = BatchUpdate(row)
update.put('content:', 'some content')
table.commit(update)

# Now fetch the content just added; the cell value is a byte[]
data_row = table.get(row, "content:")
data = java.lang.String(data_row.value, "UTF8")

print "The fetched row contains the value '%s'" % data

# Delete the table.
admin.disableTable(desc.getName())
admin.deleteTable(desc.getName())



Re: Feed Aggregator Schema

2009-08-17 Thread Andrei Savu
Thanks for your answer Peter.

I will give it a try using this approach and I will let you know how it works.

On Mon, Aug 17, 2009 at 10:26 AM, Peter
Rietzler peter.rietz...@smarter-ecommerce.com wrote:

 Hi

 In our project we are handling event lists where we have similar
 requirements. We do ordering by choosing our row keys wisely. We use the
 following key for our events (they should be ordered by time in ascending
 order):

 eventListName/MMddHHmmssSSS-000[-111]

 where eventListName is the name of the event list and 000 is a three-digit
 instance id to disambiguate between different running instances of the
 application, and -111 is optional to disambiguate events that occurred in
 the same millisecond on one instance.

 We additionally insert an artificial row for each day with the id

 eventListName/MMddHHmmssSSS

 This allows us to start scanning at the beginning of each day without
 searching through the event list.
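
To make the key layout concrete, here is a small, untested Python sketch. The
timestamp format (I assume yyyyMMddHHmmssSSS; the pattern quoted above appears
to have lost its year prefix) and the field widths are my reading of the
description, not the project's actual code:

import time

def event_row_key(event_list, instance_id, ts=None, seq=None):
    """Build a key like eventListName/yyyyMMddHHmmssSSS-000[-111]."""
    if ts is None:
        ts = time.time()
    millis = int((ts - int(ts)) * 1000)
    stamp = time.strftime('%Y%m%d%H%M%S', time.localtime(ts)) + '%03d' % millis
    key = '%s/%s-%03d' % (event_list, stamp, instance_id)
    if seq is not None:
        key += '-%03d' % seq   # disambiguates events in the same millisecond
    return key

def day_marker_key(event_list, ts=None):
    """Artificial row at midnight so a scan can start at the beginning of a day."""
    if ts is None:
        ts = time.time()
    day = time.strftime('%Y%m%d', time.localtime(ts))
    return '%s/%s000000000' % (event_list, day)

print event_row_key('clicks', 7)          # e.g. clicks/20090817143000123-007
print event_row_key('clicks', 7, seq=1)   # second event in the same millisecond
print day_marker_key('clicks')            # e.g. clicks/20090817000000000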

 You need to be aware of the fact that if you have a very high load of
 inserts, then one HBase region server is always busy inserting while the
 others are idle ... if that's a problem for you, you have to find different
 keys for your purpose.

 You could also use an HBase index table, but I have no experience with it and
 I remember an email on the mailing list saying that this would double all
 requests because the API would first look up the index table and then the
 original table ??? (please correct me if this is not right ...)

 Kind regards,
 Peter



 Andrei Savu wrote:

 Hello,

 I am working on a project involving monitoring a large number of
 rss/atom feeds. I want to use hbase for data storage and I have some
 problems designing the schema. For the first iteration I want to be
 able to generate an aggregated feed (last 100 posts from all feeds in
 reverse chronological order).

 Currently I am using two tables:

 Feeds: column families Content and Meta; the raw feed is stored in Content:raw
 Urls: column families Content and Meta; the raw post version is stored in
 Content:raw and the rest of the data found in the RSS is stored in Meta

 I need some sort of index table for the aggregated feed. How should I
 build that? Is hbase a good choice for this kind of application?

 In other words: Is it possible (in hbase) to design a schema that
 can efficiently answer queries like the one listed below?

 SELECT data FROM Urls ORDER BY date DESC LIMIT 100

 Thanks.

 --
 Savu Andrei

 Website: http://www.andreisavu.ro/



 --
 View this message in context: 
 http://www.nabble.com/Feed-Aggregator-Schema-tp24974071p25002264.html
 Sent from the HBase User mailing list archive at Nabble.com.





-- 
Savu Andrei

Website: http://www.andreisavu.ro/


Feed Aggregator Schema

2009-08-14 Thread Andrei Savu
Hello,

I am working on a project involving monitoring a large number of
rss/atom feeds. I want to use hbase for data storage and I have some
problems designing the schema. For the first iteration I want to be
able to generate an aggregated feed (last 100 posts from all feeds in
reverse chronological order).

Currently I am using two tables:

Feeds: column families Content and Meta; the raw feed is stored in Content:raw
Urls: column families Content and Meta; the raw post version is stored in
Content:raw and the rest of the data found in the RSS is stored in Meta

I need some sort of index table for the aggregated feed. How should I
build that? Is hbase a good choice for this kind of application?

In other words: Is it possible (in hbase) to design a schema that
can efficiently answer queries like the one listed below?

SELECT data FROM Urls ORDER BY date DESC LIMIT 100

Thanks.

--
Savu Andrei

Website: http://www.andreisavu.ro/