Julien, I have put up pull requests for the docs and for fixing some of the issues with LocalCluster that you found.
https://github.com/apache/storm/pull/2891
https://github.com/apache/storm/pull/2892

The VersionInfo change is a blocker and we should fix it before releasing (Sorry Taylor). For the other stuff if you find more issues we can move it to a different thread and work through them.

Thanks,

Bobby

On Mon, Oct 22, 2018 at 9:23 AM Bobby Evans <ev...@oath.com> wrote:

> I'll look at upgrading that version of http client too.
>
> On Mon, Oct 22, 2018 at 9:15 AM Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
>
>> Hi,
>>
>> I've looked into it a bit more and found that SC had a dependency on storm-core and not storm-client; I've fixed this in 40612a3... <https://github.com/DigitalPebble/storm-crawler/commit/40612a3588d66e1d410a70b1c7e5db58d5c2ba4d> however this doesn't affect the issues I had last week.
>>
>> *httpclient dependency conflict*
>> As seen last week, this is not shaded by Storm and the version used (4.3.3 <https://github.com/apache/storm/blob/ce984cd31a16e7fe4b983659005f1f7648455404/pom.xml#L266>) is quite old. Even within Storm, the Storm-SOLR module uses a more recent one (4.5 <https://github.com/apache/storm/blob/master/external/storm-solr/pom.xml#L64>). StormCrawler needs at least 4.5.5 <https://github.com/DigitalPebble/storm-crawler/blob/master/core/pom.xml#L26>. I expect other Storm users would use *httpclient* and have a similar problem. Unless I am missing something, I can see the following solutions sorted by how convenient they are to me as a user:
>>
>> 1. the dependency is shaded by Storm
>> 2. the dependency is upgraded to 4.5.5 by Storm
>> 3. the dependency is shaded by StormCrawler
>>
>> Obviously, I'd rather not have to deal with (3) and anyone using httpclient with Storm would have to do the same.
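For anyone hitting the same httpclient conflict, a small check like the following may help confirm which jar wins on the classpath. It is only a sketch: the class name `HttpClientCheck` and its helper methods are made up for illustration, and the only names taken from this thread are `org.apache.http.impl.client.HttpClientBuilder` and `setConnectionManagerShared(boolean)`, the method named in the NoSuchMethodError reported in this thread. It uses plain JDK reflection, so it compiles without httpclient present.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Storm or StormCrawler.
public class HttpClientCheck {

    /** Returns where a class was loaded from, or "unknown" when no code source is available (e.g. JDK classes). */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return (src == null || src.getLocation() == null) ? "unknown" : src.getLocation().toString();
    }

    /** True if the class exposes a public method with the given name and parameter types. */
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... params) {
        try {
            cls.getMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String className = "org.apache.http.impl.client.HttpClientBuilder";
        try {
            Class<?> builder = Class.forName(className);
            // Shows which jar actually supplied the class at runtime.
            System.out.println("Loaded from: " + locationOf(builder));
            // The method the NoSuchMethodError complains about.
            System.out.println("setConnectionManagerShared(boolean) present: "
                    + hasMethod(builder, "setConnectionManagerShared", boolean.class));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Running it with the same classpath as the topology should show whether the older Storm-provided jar or the topology's own copy is being picked up.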
>>
>> Note: I can get my topology to work by specifying a protocol implementation based on OKHttp
>> *http.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"*
>> *https.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"*
>>
>> *LocalCluster*
>> Since removing the dependency on storm-core, I can't use LocalCluster directly. I'll create a separate branch on my test repo to try to reproduce the issue.
>>
>> *Documentation for Local mode*
>> http://storm.apache.org/releases/2.0.0-SNAPSHOT/Local-mode.html does not mention *--local-ttl*. It would be good to document it and indicate what the default value is; otherwise users might wonder why their topologies run for 20 secs only. Personally, I'd rather be able to have a default behaviour where the topology runs forever or at least be able to deactivate the TTL completely by setting it to -1.
>>
>> *ConfigurableTopology*
>> I am getting a different behavior between the original ConfigurableTopology from StormCrawler <https://github.com/DigitalPebble/storm-crawler/blob/master/core/src/main/java/com/digitalpebble/stormcrawler/ConfigurableTopology.java> and when I extend the one in Storm <https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/topology/ConfigurableTopology.java>; with the latter, any configuration found in the conf files passed in args to the command line is added to the default values I provide instead of replacing them. I'll investigate that further and open an issue if I find a bug.
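To make the "added to" versus "replacing" distinction concrete, here is a minimal sketch of the two merge behaviours on plain maps. This is one possible reading of the report, not the actual code of either ConfigurableTopology class: `ConfMergeSketch`, `replace`, `append` and the key used in the usage note are all hypothetical, and the assumption that the difference shows up on list-valued settings is mine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: neither method is taken from Storm or StormCrawler.
public class ConfMergeSketch {

    /** Override semantics: a key present in the file replaces the default value entirely. */
    public static Map<String, Object> replace(Map<String, Object> defaults, Map<String, Object> fromFile) {
        Map<String, Object> out = new HashMap<>(defaults);
        out.putAll(fromFile);
        return out;
    }

    /** Additive semantics: when both values are lists, the file's entries are appended to the defaults. */
    public static Map<String, Object> append(Map<String, Object> defaults, Map<String, Object> fromFile) {
        Map<String, Object> out = new HashMap<>(defaults);
        for (Map.Entry<String, Object> e : fromFile.entrySet()) {
            Object old = out.get(e.getKey());
            if (old instanceof List && e.getValue() instanceof List) {
                List<Object> merged = new ArrayList<>((List<?>) old);
                merged.addAll((List<?>) e.getValue());
                out.put(e.getKey(), merged);
            } else {
                // Non-list values can only be overwritten.
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

With a default of `["a"]` and a file value of `["b"]` under the same (made-up) key, `replace` yields `["b"]` while `append` yields `["a", "b"]`; a user expecting the first and getting the second would see the kind of surprise described above.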
>>
>> *Distributed mode*
>> I managed to launch the various services and run my test topology in remote mode (by changing the protocol implementation as explained above)
>>
>> *Flux*
>> http://storm.apache.org/releases/2.0.0-SNAPSHOT/flux.html tells me to run
>>
>> storm jar myTopology-0.1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local my_config.yaml
>>
>> *apache-storm-2.0.0/bin/storm jar target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux*
>>
>> but am getting
>>
>> *15:07:26.206 [main] ERROR o.a.s.f.Flux - To run in local mode run with 'storm local' instead of 'storm jar'*
>>
>> so I tried both
>>
>> apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux
>>
>> and
>>
>> *apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux crawler.flux*
>> but in both cases I'm getting
>>
>> *15:12:06.784 [main] ERROR o.a.s.f.Flux - To run in local mode run with 'storm local' instead of 'storm jar'*
>> *15:12:06.784 [main] INFO o.a.s.LocalCluster -*
>>
>> *RUNNING LOCAL CLUSTER for 20 seconds.*
>>
>> and nothing happens, the topology just dies after 20 secs without fetching any URLs.
>>
>> I haven't tried Flux in distributed mode yet.
>>
>> Thanks!
>>
>> Julien
>>
>> PS: my test topology is in https://github.com/DigitalPebble/storm2
>>
>> On Fri, 19 Oct 2018 at 19:32, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
>>
>> > Hi Bobby
>> >
>> > The dependency issue happens when I have only storm-client as a dependency and not server.
>> >
>> > When trying to run it from Eclipse I had to add server to the pom, as expected but also client as I was getting
>> >
>> > 19:22:13.044 [main] ERROR o.a.s.u.VersionInfo - Could not load storm-core-version-info.properties
>> > java.io.IOException: Resource not found
>> >     at org.apache.storm.utils.VersionInfo$VersionInfoImpl.<init>(VersionInfo.java:53) [storm-client-2.0.0.jar:2.0.0]
>> >     at org.apache.storm.utils.VersionInfo.<clinit>(VersionInfo.java:41) [storm-client-2.0.0.jar:2.0.0]
>> >     at org.apache.storm.daemon.nimbus.Nimbus.<clinit>(Nimbus.java:281) [storm-server-2.0.0.jar:2.0.0]
>> >     at org.apache.storm.LocalCluster.<init>(LocalCluster.java:235) [storm-server-2.0.0.jar:2.0.0]
>> >     at org.apache.storm.LocalCluster.<init>(LocalCluster.java:156) [storm-server-2.0.0.jar:2.0.0]
>> >     at com.digitalpebble.stormcrawler.ConfigurableTopology.submit(ConfigurableTopology.java:74) [classes/:?]
>> >     at com.dipe.sc.CrawlTopology.run(CrawlTopology.java:80) [classes/:?]
>> >     at com.digitalpebble.stormcrawler.ConfigurableTopology.start(ConfigurableTopology.java:49) [classes/:?]
>> >     at com.dipe.sc.CrawlTopology.main(CrawlTopology.java:39) [classes/:?]
>> >
>> > I've put the code in https://github.com/DigitalPebble/storm2 if you want to have a look. You'll need to compile the branch 2.x of SC first
>> > https://github.com/DigitalPebble/storm-crawler/tree/2.x
>> >
>> > To reproduce the ZK issue, open the project in Eclipse and run the CrawlTopology class with "-local -conf crawler-conf.yaml" in arguments.
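For the "Could not load storm-core-version-info.properties" error in the trace above, a quick way to see whether that resource is visible at all is to ask the class loader directly. The sketch below is an assumption-laden diagnostic, not how Storm's VersionInfo performs its lookup (it may well use a different class loader); `VersionInfoResourceCheck` and `find` are hypothetical names, and only the resource name comes from the log line.

```java
import java.net.URL;

// Hypothetical diagnostic, independent of Storm's own VersionInfo code.
public class VersionInfoResourceCheck {

    /** Returns the URL of the named classpath resource as a string, or null if it cannot be found. */
    public static String find(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        URL url = cl.getResource(resourceName);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // Resource name as it appears in the error message.
        String name = "storm-core-version-info.properties";
        String where = find(name);
        System.out.println(where == null
                ? name + " not found on classpath"
                : name + " found at " + where);
    }
}
```

Run from the same Eclipse launch configuration as the topology, the printed location (or its absence) indicates which dependency, if any, ships the properties file.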
>> >
>> > For the dependency problem, mvn clean package followed by
>> > /data/apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar com.dipe.sc.CrawlTopology -conf crawler-conf.yaml
>> > should give java.lang.NoSuchMethodError: org.apache.http.impl.client.HttpClientBuilder.setConnectionManagerShared(Z)Lorg/apache/http/impl/client/HttpClientBuilder;
>> >
>> > Thanks
>> >
>> > Julien
>> >
>> > On Fri, 19 Oct 2018 at 17:26, Bobby Evans <bo...@apache.org> wrote:
>> >
>> >> Sorry I should clarify a bit.
>> >>
>> >> `storm local` will run things in local mode, but the classpath will include things that are not shaded.
>> >>
>> >> This is also true for trying to run tests from Eclipse. LocalCluster is a part of storm-server so you will need to pull that in just for testing. storm-client is what you want to depend on for the majority of your topology.
>> >>
>> >> The ZK issue is new to me. We have done a lot in local mode and not seen that as an issue. If you can help me reproduce it I am happy to try and debug it to see what is happening.
>> >>
>> >> Thanks,
>> >>
>> >> Bobby
>> >>
>> >> On Fri, Oct 19, 2018 at 11:21 AM Bobby Evans <bo...@apache.org> wrote:
>> >>
>> >> > It is shaded in storm 2.x, but we split the classpath up, so what you want to depend on is storm-client only. I see you are pulling in storm-core and a few other things that are not shaded, because they are only used by the daemons, not the clients.
>> >> >
>> >> > On Fri, Oct 19, 2018 at 10:55 AM Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
>> >> >
>> >> >> Sorry, hit Return too quickly
>> >> >>
>> >> >> I am testing Storm 2.0.0 with StormCrawler, not very successfully. One immediate issue is that I am getting a version conflict on httpclient as the version set by Storm is older than the one I need.
>> >> >>
>> >> >> java.lang.NoSuchMethodError: org.apache.http.impl.client.HttpClientBuilder.setConnectionManagerShared(Z)Lorg/apache/http/impl/client/HttpClientBuilder;
>> >> >>     at com.digitalpebble.stormcrawler.protocol.httpclient.HttpProtocol.configure(HttpProtocol.java:141) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >>     at com.digitalpebble.stormcrawler.protocol.ProtocolFactory.<init>(ProtocolFactory.java:69) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >>     at com.digitalpebble.stormcrawler.bolt.FetcherBolt.prepare(FetcherBolt.java:760) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >>     at org.apache.storm.executor.bolt.BoltExecutor.init(BoltExecutor.java:144) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >>     at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:154) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >>     at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:58) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >>     at org.apache.storm.utils.Utils$1.run(Utils.java:353) [storm-client-2.0.0.jar:2.0.0]
>> >> >>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
>> >> >>
>> >> >> Here is the classpath when calling *storm local ....*
>> >> >>
>> >> >> *16:38:03.445 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Client
environment:java.class.path=/data/apache-storm-2.0.0/*:/data/apache-storm-2.0.0/lib/log4j-over-slf4j-1.6.6.jar:/data/apache-storm-2.0.0/lib/hadoop-auth-2.6.1.jar:/data/apache-storm-2.0.0/lib/jaxb-api-2.3.0.jar:/data/apache-storm-2.0.0/lib/kryo-shaded-3.0.3.jar:/data/apache-storm-2.0.0/lib/kryo-3.0.3.jar:/data/apache-storm-2.0.0/lib/commons-cli-1.4.jar:/data/apache-storm-2.0.0/lib/log4j-slf4j-impl-2.11.1.jar:/data/apache-storm-2.0.0/lib/jetty-continuation-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/httpclient-4.3.3.jar:/data/apache-storm-2.0.0/lib/commons-io-2.6.jar:/data/apache-storm-2.0.0/lib/commons-collections-3.2.2.jar:/data/apache-storm-2.0.0/lib/guava-16.0.1.jar:/data/apache-storm-2.0.0/lib/metrics-graphite-3.2.6.jar:/data/apache-storm-2.0.0/lib/jetty-http-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/tools.logging-0.2.3.jar:/data/apache-storm-2.0.0/lib/jetty-util-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/rocksdbjni-5.8.6.jar:/data/apache-storm-2.0.0/lib/commons-fileupload-1.3.3.jar:/data/apache-storm-2.0.0/lib/curator-framework-4.0.1.jar:/data/apache-storm-2.0.0/lib/jackson-dataformat-smile-2.9.4.jar:/data/apache-storm-2.0.0/lib/asm-5.0.3.jar:/data/apache-storm-2.0.0/lib/jetty-io-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/chill-java-0.8.0.jar:/data/apache-storm-2.0.0/lib/curator-client-4.0.1.jar:/data/apache-storm-2.0.0/lib/httpcore-4.3.2.jar:/data/apache-storm-2.0.0/lib/log4j-api-2.11.1.jar:/data/apache-storm-2.0.0/lib/jetty-security-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/storm-clojure-2.0.0.jar:/data/apache-storm-2.0.0/lib/commons-compress-1.16.1.jar:/data/apache-storm-2.0.0/lib/jetty-server-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/netty-3.7.0.Final.jar:/data/apache-storm-2.0.0/lib/json-simple-1.1.jar:/data/apache-storm-2.0.0/lib/junit-4.12.jar:/data/apache-storm-2.0.0/lib/jetty-servlet-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/objenesis-2.6.jar:/data/apache-storm-2.0.0/lib/jetty-servlets-9.4.7.v20170914.jar:/da
ta/apache-storm-2.0.0/lib/carbonite-1.5.0.jar:/data/apache-storm-2.0.0/lib/storm-server-2.0.0.jar:/data/apache-storm-2.0.0/lib/shaded-deps-2.0.0.jar:/data/apache-storm-2.0.0/lib/javax.servlet-api-3.1.0.jar:/data/apache-storm-2.0.0/lib/commons-logging-1.1.3.jar:/data/apache-storm-2.0.0/lib/jline-0.9.94.jar:/data/apache-storm-2.0.0/lib/storm-client-2.0.0.jar:/data/apache-storm-2.0.0/lib/snakeyaml-1.11.jar:/data/apache-storm-2.0.0/lib/hamcrest-core-1.3.jar:/data/apache-storm-2.0.0/lib/minlog-1.3.0.jar:/data/apache-storm-2.0.0/lib/slf4j-api-1.7.21.jar:/data/apache-storm-2.0.0/lib/log4j-core-2.11.1.jar:/data/apache-storm-2.0.0/lib/commons-exec-1.3.jar:/data/apache-storm-2.0.0/lib/storm-core-2.0.0.jar:/data/apache-storm-2.0.0/lib/jackson-core-2.9.4.jar:/data/apache-storm-2.0.0/lib/zookeeper-3.4.6.jar:/data/apache-storm-2.0.0/lib/commons-lang-2.6.jar:/data/apache-storm-2.0.0/lib/clojure-1.7.0.jar:/data/apache-storm-2.0.0/lib/metrics-core-3.2.6.jar:/data/apache-storm-2.0.0/lib/reflectasm-1.10.1.jar:/data/apache-storm-2.0.0/lib/commons-codec-1.11.jar:/data/apache-storm-2.0.0/lib/joda-time-2.3.jar:/data/apache-storm-2.0.0/extlib/*:target/2-1.0-SNAPSHOT.jar:/data/apache-storm-2.0.0/conf:/data/apache-storm-2.0.0/bin* >> >> >> >> >> >> This doesn't happen with Storm 1.2.2. Aren't these libs supposed to >> be >> >> >> shaded by Storm? 
>> >> >>
>> >> >> Another issue is when I try to launch a topology from Eclipse (as I was able to do with Storm 1.x), even when adding
>> >> >>
>> >> >> <dependency>
>> >> >>   <groupId>org.apache.storm</groupId>
>> >> >>   <artifactId>storm-server</artifactId>
>> >> >>   <version>2.0.0</version>
>> >> >> </dependency>
>> >> >> <dependency>
>> >> >>   <groupId>org.apache.storm</groupId>
>> >> >>   <artifactId>storm-core</artifactId>
>> >> >>   <version>2.0.0</version>
>> >> >> </dependency>
>> >> >>
>> >> >> as suggested by http://storm.apache.org/releases/2.0.0-SNAPSHOT/Local-mode.html, there seems to be an issue with ZK. The 2nd dependency is not mentioned on that page but seems to be needed.
>> >> >>
>> >> >> *16:50:53.041 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x1668d05b0630007 type:create cxid:0x2 zxid:0x28 txntype:-1 reqpath:n/a Error Path:/storm/blobstoremaxkeysequencenumber Error:KeeperErrorCode = NoNode for /storm/blobstoremaxkeysequencenumber*
>> >> >>
>> >> >> and the topology never starts. I could, of course, rely on "storm local" but being able to run a local topology without installing Storm is quite nice for users who just want to give it a try.
>> >> >>
>> >> >> Any thoughts?
>> >> >>
>> >> >> Julien
>> >> >>
>> >> >> On Fri, 19 Oct 2018 at 16:40, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > I am testing Storm 2.0.0 with StormCrawler, not very successfully
>> >> >> >
>> >> >> > On Tue, 16 Oct 2018 at 20:48, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>> >> >> >
>> >> >> >> This is a call to vote on releasing Apache Storm 2.0.0 (rc3)
>> >> >> >>
>> >> >> >> Full list of changes in this release:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/RELEASE_NOTES.html
>> >> >> >>
>> >> >> >> The tag/commit to be voted upon is v2.0.0:
>> >> >> >>
>> >> >> >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=commit;h=d2d6f40344e6cc92ab07f3a462d577ef6b61f8b1
>> >> >> >>
>> >> >> >> The source archive being voted upon can be found here:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/apache-storm-2.0.0-src.tar.gz
>> >> >> >>
>> >> >> >> Other release files, signatures and digests can be found here:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/
>> >> >> >>
>> >> >> >> The release artifacts are signed with the following key:
>> >> >> >>
>> >> >> >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> >> >> >>
>> >> >> >> The Nexus staging repository for this release is:
>> >> >> >>
>> >> >> >> https://repository.apache.org/content/repositories/orgapachestorm-1072
>> >> >> >>
>> >> >> >> Please vote on releasing this package as Apache Storm 2.0.0.
>> >> >> >>
>> >> >> >> When voting, please list the actions taken to verify the release.
>> >> >> >>
>> >> >> >> This vote will be open for at least 72 hours.
>> >> >> >>
>> >> >> >> [ ] +1 Release this package as Apache Storm 2.0.0
>> >> >> >> [ ] 0 No opinion
>> >> >> >> [ ] -1 Do not release this package because...
>> >> >> >>
>> >> >> >> Thanks to everyone who contributed to this release.
>> >> >> >>
>> >> >> >> -Taylor
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > *Open Source Solutions for Text Engineering*
>> >> >> >
>> >> >> > http://www.digitalpebble.com
>> >> >> > http://digitalpebble.blogspot.com/
>> >> >> > #digitalpebble <http://twitter.com/digitalpebble>