In addition, do you have an example of consuming a Twitter stream and using it in SAMOA?
Thank you

John Calvo M.Sc. B.Eng.
PhD Student
School of Computer Science and Engineering
UNSW AUSTRALIA
Building K17 Room 301-04
SYDNEY 2052 NSW Australia
W: www.computing.unsw.edu.au
FB: https://www.facebook.com/UNSW.COMPUTING
TW: @UNSWCOMPUTING
G+: UNSW CSE
E: [email protected]
P: (+61) 2 9385 6916 (Internal: x56916)
M: (+61) 04 5161 4230

On 29 Oct 2015, at 10:53, John Calvo Martinez <[email protected]> wrote:

Dear Nicolas,

Thanks for your answer. I finally got it working with Storm! As you mentioned, the key point is to configure Storm properly. My case was a bit different because I installed it on a Mac OS X machine. I'm sending you the details of my installation in case they are helpful.

Best regards,

<SAMOA Installation.rtf>

John Calvo

On 15 Oct 2015, at 18:27, Nicolas Kourtellis <[email protected]> wrote:

Hi John,

I don't have experience with Samza, but it could be an issue with your classpath. In any case, I have played with Samoa + Storm and it works. Storm itself can be a bit involved to set up, but once you do, it should work fine. If you want to try it out, here is a list of steps I followed that worked for me. I will upload these steps to the samoa page as well for future reference.
Hope they help,
Nicolas

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Installation of Storm cluster:
https://storm.apache.org/documentation/Setting-up-a-Storm-cluster.html

Download a stable distribution from https://storm.apache.org/downloads.html, e.g.:

    wget http://ftp.cixug.es/apache/storm/apache-storm-0.9.3/apache-storm-0.9.3.tar.gz

Untar the file:

    tar -xvf apache-storm-0.9.3.tar.gz

Set up the conf/storm.yaml file within the unpacked folder. An example is the following:

    storm.zookeeper.servers:
      - "127.0.0.1"
    storm.local.dir: "/var/storm-logs"
    nimbus.host: "127.0.0.1"
    supervisor.slots.ports:
      - 6700
    worker.childopts: "-Xmx2000m"
    supervisor.childopts: "-Xmx256m"
    nimbus.childopts: "-Xmx512m"

Create the folder ~/.storm

Copy the file conf/storm.yaml into the folder ~/.storm/

Set your $STORM_HOME to point to the Storm folder, e.g.:

    export STORM_HOME=/homedirectory/apache-storm-0.9.3

Installation of Zookeeper:
http://zookeeper.apache.org/doc/r3.3.3/zookeeperStarted.html#sc_InstallingSingleMode

Download:

    wget http://apache.rediris.es/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz

Untar:

    tar -xvf zookeeper-3.4.6.tar.gz

Go into the unpacked folder, create the conf/zoo.cfg file and modify the dataDir directory.

Start Zookeeper:

    bin/zkServer.sh start

Start the nimbus process (go into the folder containing the bin/storm script). It is better to run this from a screen terminal so that you can detach from it after starting:

    ./storm nimbus

Start the supervisor process (same folder, again preferably from a screen terminal):

    ./storm supervisor

Start the UI for Storm (same folder, again preferably from a screen terminal):

    ./storm ui

After you have downloaded, unpacked and mvn-ed the samoa package, you can execute the bin/samoa command with storm as the processing engine (and cross fingers!).
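The configuration steps above (write conf/storm.yaml, create ~/.storm, copy the file there) could be scripted roughly as follows. This is only a sketch: the YAML values are the example ones from the steps above, while the function name `install_storm_config` and its two directory parameters are made up for illustration and are not part of Storm or SAMOA.

```python
from pathlib import Path
from typing import List

# Example values copied from the storm.yaml shown in the steps above;
# adjust hosts, ports and memory settings for your own machine.
STORM_YAML = """\
storm.zookeeper.servers:
  - "127.0.0.1"
storm.local.dir: "/var/storm-logs"
nimbus.host: "127.0.0.1"
supervisor.slots.ports:
  - 6700
worker.childopts: "-Xmx2000m"
supervisor.childopts: "-Xmx256m"
nimbus.childopts: "-Xmx512m"
"""


def install_storm_config(storm_home: Path, user_home: Path) -> List[Path]:
    """Write storm.yaml to <storm_home>/conf and to <user_home>/.storm.

    Hypothetical helper: it only mirrors the manual "create folder" and
    "copy file" steps and does not start or validate anything.
    """
    written = []
    for directory in (storm_home / "conf", user_home / ".storm"):
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / "storm.yaml"
        target.write_text(STORM_YAML)
        written.append(target)
    return written
```

It would be called with the unpacked Storm folder and the home directory, e.g. `install_storm_config(Path("/homedirectory/apache-storm-0.9.3"), Path.home())`. Exporting $STORM_HOME and starting Zookeeper, nimbus, supervisor and the UI still have to be done by hand as described above.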
On Thu, Oct 15, 2015 at 8:20 AM, John Calvo Martinez <[email protected]> wrote:

Dear Gianmarco,

I hope you are well. I'm writing because I was trying to use SAMOA on Samza, S4 and Storm, but none of them worked for me. Could you help me with this?

The one closest to running was Samza. I installed Zookeeper and Kafka, and those worked well. Following the tutorial at http://samoa.incubator.apache.org/documentation/Executing-SAMOA-with-Apache-Samza.html I tried to build the Samza maven package, but it seems that the git folder is no longer available, so I decided to use and build this repo: https://github.com/apache/samza

This built successfully, but when I try to use SAMOA I get an error. First, it seems that the package was built successfully:

[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ samoa-test ---
[INFO] Building jar: /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating.jar
[INFO]
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ samoa-test ---
[INFO]
[INFO] --- maven-jar-plugin:2.4:test-jar (default) @ samoa-test ---
[INFO] Building jar: /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating-tests.jar
[INFO]
[INFO] --- maven-assembly-plugin:2.4.1:single (default) @ samoa-test ---
[INFO] Reading assembly descriptor: src/main/assembly/test-jar-with-dependencies.xml
[INFO] Building jar: /usr/local/samoa-0.3.0-incubating/samoa-test/target/samoa-test-0.3.0-incubating-test-jar-with-dependencies.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache SAMOA ....................................... SUCCESS [ 4.328 s]
[INFO] samoa-instances .................................... SUCCESS [ 1.936 s]
[INFO] samoa-api .......................................... SUCCESS [ 11.857 s]
[INFO] samoa-samza ........................................ SUCCESS [ 16.251 s]
[INFO] samoa-test ......................................... SUCCESS [ 1.910 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 36.563 s
[INFO] Finished at: 2015-10-15T17:04:38+11:00
[INFO] Final Memory: 52M/1493M
[INFO] ------------------------------------------------------------------------

But when I try to use it, a class-loading error occurs:

$ bin/samoa samza target/SAMOA-Samza-0.3.0-SNAPSHOT.jar "PrequentialEvaluation -d /tmp/dump.csv -i 1000000 -f 100000 -l (classifiers.trees.VerticalHoeffdingTree -p 4) -s (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
bin/samoa
Deploying to SAMZA
Error: Could not find or load main class org.apache.samoa.SamzaDoTask

The Kafka server is running:

[2015-10-15 16:07:51,534] INFO Initiating client connection, connectString=localhost:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@7364985f (org.apache.zookeeper.ZooKeeper)
[2015-10-15 16:07:51,553] INFO Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2015-10-15 16:07:51,622] INFO Socket connection established to localhost/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2015-10-15 16:07:51,709] INFO Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15069dd8c1d0000, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2015-10-15 16:07:51,710] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2015-10-15 16:07:51,795] INFO Log directory '/tmp/kafka-logs' not found, creating it. (kafka.log.LogManager)
[2015-10-15 16:07:51,804] INFO Loading logs. (kafka.log.LogManager)
[2015-10-15 16:07:51,809] INFO Logs loading complete. (kafka.log.LogManager)
[2015-10-15 16:07:51,809] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
[2015-10-15 16:07:51,813] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
[2015-10-15 16:07:51,844] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)
[2015-10-15 16:07:51,845] INFO [Socket Server on Broker 0], Started (kafka.network.SocketServer)
[2015-10-15 16:07:51,903] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2015-10-15 16:07:51,930] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2015-10-15 16:07:51,996] INFO Registered broker 0 at path /brokers/ids/0 with address 10.248.15.104:9092. (kafka.utils.ZkUtils$)
[2015-10-15 16:07:52,011] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2015-10-15 16:07:52,050] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)

And Zookeeper as well:

2015-10-15 17:17:34,334 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Mac OS X
2015-10-15 17:17:34,334 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=x86_64
2015-10-15 17:17:34,334 [myid:] - INFO [main:Environment@100] - Client environment:os.version=10.11
2015-10-15 17:17:34,334 [myid:] - INFO [main:Environment@100] - Client environment:user.name=johncalvo
2015-10-15 17:17:34,334 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/Users/johncalvo
2015-10-15 17:17:34,335 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/usr/local/zookeeper-3.4.6
2015-10-15 17:17:34,336 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
Welcome to ZooKeeper!
2015-10-15 17:17:34,372 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@975] - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2015-10-15 17:17:34,463 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@852] - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
[zk: 127.0.0.1:2181(CONNECTING) 0] 2015-10-15 17:17:34,546 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x15069dd8c1d0001, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

What can we do? Let me know your comments.

PS: In addition, we would like to know if you are planning to implement SAMOA on other SPE….

Thank you.

All the best,

John Calvo

On 7 Aug 2015, at 18:20, Gianmarco De Francisci Morales <[email protected]> wrote:

Redirecting John's question to the mailing list.

John, it seems the script cannot find the jar. Have you compiled it from the current master? If so, the jar should be "SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar". Most of the examples and docs need to be updated, given that we recently made a new release.
-- Gianmarco

On 6 August 2015 at 11:20, John Calvo <[email protected]> wrote:

Hi Gianmarco,

I hope you are well. I'm writing because I was trying to explore SAMOA but got an error running the prequential example:

bin/samoa local target/SAMOA-Local-0.3.0-SNAPSHOT.jar "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (ArffFileStream -f covtypeNorm.arff) -f 100000"
bin/samoa
Deploying to LOCAL
Error: Could not find or load main class org.apache.samoa.LocalDoTask

Do you know what might be missing? I tried a local installation, without SPE….

Any help would be appreciated!

Best regards,

John Calvo

-- Nicolas Kourtellis
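
Both "Could not find or load main class" errors in this thread are consistent with the diagnosis given above: the jar path passed to bin/samoa does not point to a jar that actually contains the class. One way to check this before launching is sketched below. It relies only on the fact that a JAR file is a zip archive; the helper name `jar_has_class` is made up for illustration and is not part of SAMOA.

```python
import zipfile
from pathlib import Path


def jar_has_class(jar_path, class_name):
    """Return True if jar_path is an existing jar that contains class_name.

    Hypothetical diagnostic helper: class_name is a fully qualified Java
    class name such as "org.apache.samoa.LocalDoTask".
    """
    jar = Path(jar_path)
    if not jar.is_file():
        # A missing jar on the classpath produces the same JVM error
        # as a jar that lacks the class.
        return False
    # Compiled classes are stored as path/to/ClassName.class entries.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar) as archive:
        return entry in archive.namelist()
```

For the local run above, one would call `jar_has_class("target/SAMOA-Local-0.3.0-SNAPSHOT.jar", "org.apache.samoa.LocalDoTask")`. A result of False suggests that the file name is out of date (for example after the new release mentioned above) or that the build did not produce that jar.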
