Hello,

I'm looking to move our distributed services (currently using Camel's JMS remoting) to a dOSGi solution. I won't be looking at ECF (simply because they don't seem to publish up-to-date Maven artifacts), so that leaves Fuse Fabric and Karaf Cellar.
Apologies for the length of this mail; what follows is a log of the steps I took with Fabric, in the hope that somebody can spot something. I thought I'd try to get something working in FuseESB before porting it back to Karaf 2.2.x. So far I've burnt a bit of time with Fabric but not made any progress (tried trunk and fuse-fabric-7.0.0.fuse-061, currently looking at fuse-fabric-7.0.1.fuse-084). I suspect I'm doing something stupid or have missed out something vital.

Starting with the simplest distributed case, I've got a features.xml and 3 simple bundles: a service interface bundle, a client bundle and a service implementation bundle. I've ensured the client and service work in a single instance, but am a bit stumped as to how to get this running over Fabric. The service implementation exports the property "service.exported.interfaces" as "*", and the interface implied by the asterisk is available to both the service implementation and client bundles (i.e. deployed to both instances).

On both boxes I've temporarily instructed iptables to open all ports and I'm running the fusefabric command as root.
# On box1 (provider):
tar -xzf /data/share/fuse-fabric-7.0.1.fuse-084.tar.gz
cd fuse-fabric-7.0.1.fuse-084/
sudo ./bin/fusefabric

# On box1's Karaf shell:
log:set TRACE net.earcam
log:set TRACE org.fusesource.fabric.dosgi
features:install fabric-commands fabric-dosgi fabric-configadmin fabric-zookeeper-commands
# Line above logs exception as Zookeeper not created or ensembled - assuming safe to ignore
fabric:create --clean root
# Causes exception 1 (see below), again assuming this is safe to ignore
fabric:profile-create --parents dosgi dosgi-provider
fabric:profile-edit --repositories mvn:net.earcam.example.hello/net.earcam.example.hello.feature/0.0.1-SNAPSHOT/xml/features dosgi-provider
fabric:profile-edit --features hello-server dosgi-provider
fabric:container-create --profile dosgi-provider --parent root dosgi-provider
fabric:container-list
# Shows success
fabric:container-connect dosgi-provider
# Connects fine, can see ..example.hello bundles installed
log:set TRACE net.earcam
log:set TRACE org.fusesource.fabric.dosgi
# ------

# On box2 (consumer):
tar -xzf /data/share/fuse-fabric-7.0.1.fuse-084.tar.gz
cd fuse-fabric-7.0.1.fuse-084/
sed -i s/karaf.name=root/karaf.name=vm-mint1/ etc/system.properties
sudo ./bin/fusefabric

# On box2's Karaf shell:
log:set TRACE net.earcam
log:set TRACE org.fusesource.fabric.dosgi
features:install fabric-commands fabric-dosgi fabric-configadmin fabric-zookeeper-commands
fabric:join 10.39.216.49
# The IP of box1
fabric:profile-create --parents dosgi dosgi-consumer
fabric:profile-edit --repositories mvn:net.earcam.example.hello/net.earcam.example.hello.feature/0.0.1-SNAPSHOT/xml/features dosgi-consumer
fabric:profile-edit --features hello-client dosgi-consumer
fabric:container-create --profile dosgi-consumer --parent vm-mint1 dosgi-consumer
fabric:container-list
# Shows success
fabric:container-connect dosgi-consumer
# Connects fine, can see ..example.hello bundles installed
log:set TRACE net.earcam
log:set TRACE org.fusesource.fabric.dosgi
# ------

The "provision status" is "success" for all when running container-list and I can connect to the two new containers dosgi-provider and dosgi-consumer.

On box2, in the dosgi-consumer container, I can see the remote service exported from box1, and it disappears/appears as I stop/start the exporting service bundle on box1. The imported service has "correct looking" properties:

component.id = 2
component.name = net.earcam.example.hello.server.HelloService
endpoint.framework.uuid = b5806ad9-7849-4aaf-adf7-56ba1e6538e3
endpoint.id = 2008920412-47095-1348493938958-0-3
fabric.address = tcp://localhost:44462
objectClass = net.earcam.example.hello.api.GreetingService
service.id = 246
service.imported = true
service.imported.configs = fabric-dosgi
service.pid = net.earcam.example.hello.server.HelloService

# Exception 1, seen on both boxes:
java.lang.NullPointerException
    at org.fusesource.fabric.dosgi.impl.Manager.destroy(Manager.java:175)
    at org.fusesource.fabric.dosgi.Activator.destroy(Activator.java:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)[:1.6.0_33]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)[:1.6.0_33]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)[:1.6.0_33]
    at java.lang.reflect.Method.invoke(Method.java:597)[:1.6.0_33]
    at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:225)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:838)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:743)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:295)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:673)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:826)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintExtender.destroyContext(BlueprintExtender.java:255)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintExtender.bundleChanged(BlueprintExtender.java:247)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintBundleTrackerCustomizer.modifiedBundle(BlueprintExtender.java:471)[10:org.apache.aries.blueprint:0.3.1.fuse-70-084]
    at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:495)[karaf.jar:2.2.5.fuse-70-084]
    at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:1)[karaf.jar:2.2.5.fuse-70-084]
    at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:238)[karaf.jar:2.2.5.fuse-70-084]
    at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:457)[karaf.jar:2.2.5.fuse-70-084]
    at org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:870)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:791)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:515)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4431)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.apache.felix.framework.Felix.stopBundle(Felix.java:2532)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:983)[org.apache.felix.framework-4.0.3.fuse-70-084.jar:]
    at org.fusesource.fabric.agent.DeploymentAgent.updateDeployment(DeploymentAgent.java:709)[102:org.fusesource.fabric.fabric-agent:7.0.1.fuse-084]
    at org.fusesource.fabric.agent.DeploymentAgent.doUpdate(DeploymentAgent.java:415)[102:org.fusesource.fabric.fabric-agent:7.0.1.fuse-084]
    at org.fusesource.fabric.agent.DeploymentAgent$1.run(DeploymentAgent.java:225)[102:org.fusesource.fabric.fabric-agent:7.0.1.fuse-084]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)[:1.6.0_33]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)[:1.6.0_33]
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)[:1.6.0_33]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)[:1.6.0_33]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)[:1.6.0_33]
    at java.lang.Thread.run(Thread.java:662)[:1.6.0_33]

# Exception 2, seen on box2:
2012-09-24 15:16:00,142 | INFO | spatch-DEFAULT-1 | TransportPool | 67 - org.fusesource.fabric.fabric-dosgi - 7.0.1.fuse-084 | Transport failure
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.7.0_07]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)[:1.7.0_07]
    at org.fusesource.fabric.dosgi.tcp.TcpTransport$5.run(TcpTransport.java:483)[67:org.fusesource.fabric.fabric-dosgi:7.0.1.fuse-084]
    at org.fusesource.hawtdispatch.internal.NioDispatchSource$3.run(NioDispatchSource.java:228)[78:org.fusesource.hawtdispatch.hawtdispatch:1.9.0]
    at org.fusesource.hawtdispatch.internal.SerialDispatchQueue.run(SerialDispatchQueue.java:84)[78:org.fusesource.hawtdispatch.hawtdispatch:1.9.0]
    at org.fusesource.hawtdispatch.internal.pool.SimpleThread.run(SimpleThread.java:77)[78:org.fusesource.hawtdispatch.hawtdispatch:1.9.0]
2012-09-24 15:16:00,147 | INFO | agent-1-thread-1 | DeploymentAgent | 63 - org.fusesource.fabric.fabric-agent - 7.0.1.fuse-084 | Done.

# After which the logs are filled with:
2012-09-24 15:18:14,435 | INFO | nana.local:2181) | ClientCnxn | 54 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.0.1.fuse-084 | Opening socket connection to server banana.local/10.39.216.49:2181
2012-09-24 15:18:14,436 | INFO | nana.local:2181) | ZooKeeperSaslClient | 54 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.0.1.fuse-084 | Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-09-24 15:18:14,437 | INFO | nana.local:2181) | ClientCnxn | 54 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.0.1.fuse-084 | Socket connection established to banana.local/10.39.216.49:2181, initiating session
2012-09-24 15:18:14,438 | INFO | nana.local:2181) | ClientCnxn | 54 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.0.1.fuse-084 | Unable to read additional data from server sessionid 0x139f880728d0005, likely server has closed socket, closing socket connection and attempting reconnect

From what I've read the SASL messages are safe to ignore, but exception 2 looks related to the problem. I don't think it's a configuration issue with the OSes: the SocketChannel fails to connect, but all ports are open and I can telnet remotely to that IP on port 2181.

The code I'm using can be found here: https://dl.dropbox.com/u/2465717/net.earcam.example.hello.tgz (just a simple Maven project).

I'd be very grateful if anyone has any suggestions as to the cause, an RTFM with link, or a pointer to a working example, etc.

Thanks,

Best regards,
Caspar
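
P.S. The telnet test above only covers the ZooKeeper port (2181), whereas exception 2 is raised from org.fusesource.fabric.dosgi.tcp.TcpTransport, so it may be worth probing the address advertised in fabric.address as well. That reading of the stack trace is an assumption, not something confirmed. Below is a minimal sketch of such a reachability check using only java.net from the JDK; the class name EndpointProbe and its methods are hypothetical and not part of Fabric.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Hypothetical helper, not part of Fabric: checks whether a TCP endpoint accepts connections.
class EndpointProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts and unresolvable hosts.
            return false;
        }
    }

    /** Parses an address of the form tcp://host:port (as seen in fabric.address) and probes it. */
    static boolean canConnect(String address, int timeoutMillis) {
        URI uri = URI.create(address);
        return canConnect(uri.getHost(), uri.getPort(), timeoutMillis);
    }

    public static void main(String[] args) {
        // Default is the value shown in the imported service properties; pass another as args[0].
        String address = args.length > 0 ? args[0] : "tcp://localhost:44462";
        System.out.println(address + " reachable: " + canConnect(address, 2000));
    }
}
```

Run from box2, once with the advertised address (tcp://localhost:44462) and once with the same port on box1's IP (tcp://10.39.216.49:44462), this would show whether the endpoint is reachable at all and whether the hostname in fabric.address makes a difference. The port changes between restarts, so it needs to be taken from the current service properties.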