Julien,

I have put up pull requests for the docs and for fixing some of the issues
with LocalCluster that you found.

https://github.com/apache/storm/pull/2891

https://github.com/apache/storm/pull/2892

The VersionInfo change is a blocker and we should fix it before releasing
(Sorry Taylor).

For the other items, if you find more issues we can move them to a separate
thread and work through them there.

Thanks,

Bobby

On Mon, Oct 22, 2018 at 9:23 AM Bobby Evans <ev...@oath.com> wrote:

> I'll look at upgrading that version of http client too.
>
> On Mon, Oct 22, 2018 at 9:15 AM Julien Nioche <
> lists.digitalpeb...@gmail.com> wrote:
>
>> Hi,
>>
>> I've looked into it a bit more and found that SC had a dependency on
>> storm-core and not storm-client; I've fixed this in 40612a3...
>> <https://github.com/DigitalPebble/storm-crawler/commit/40612a3588d66e1d410a70b1c7e5db58d5c2ba4d>
>> however this doesn't affect the issues I had last week.
>>
>> *httpclient dependency conflict*
>> As seen last week, this is not shaded by Storm and the version used (4.3.3
>> <https://github.com/apache/storm/blob/ce984cd31a16e7fe4b983659005f1f7648455404/pom.xml#L266>)
>> is quite old. Even within Storm, the Storm-SOLR module uses a more recent
>> one (4.5
>> <https://github.com/apache/storm/blob/master/external/storm-solr/pom.xml#L64>).
>> StormCrawler needs at least 4.5.5
>> <https://github.com/DigitalPebble/storm-crawler/blob/master/core/pom.xml#L26>.
>> I expect other Storm users would use *httpclient* and have a similar
>> problem. Unless I am missing something, I can see the following solutions
>> sorted by how convenient they are to me as a user:
>>
>>    1. the dependency is shaded by Storm
>>    2. the dependency is upgraded to 4.5.5 by Storm
>>    3. the dependency is shaded by StormCrawler
>>
>> Obviously, I'd rather not have to deal with (3), since anyone else using
>> httpclient with Storm would have to do the same.
>>
>> Note: I can get my topology to work by specifying a protocol implementation
>> based on OkHttp:
>>   http.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
>>   https.protocol.implementation: "com.digitalpebble.stormcrawler.protocol.okhttp.HttpProtocol"
>>
>> *LocalCluster*
>> Since removing the dependency on storm-core, I can't use LocalCluster
>> directly. I'll create a separate branch on my test repo to try to
>> reproduce the issue.
>>
>> *Documentation for Local mode*
>> http://storm.apache.org/releases/2.0.0-SNAPSHOT/Local-mode.html
>> does not mention *--local-ttl*. It would be good to document it and indicate
>> what the default value is; otherwise users might wonder why their topologies
>> run for 20 secs only. Personally, I'd rather have a default behaviour where
>> the topology runs forever, or at least be able to deactivate the TTL
>> completely by setting it to -1.
>>
>> *ConfigurableTopology*
>> I am getting a different behavior between the original ConfigurableTopology
>> from StormCrawler
>> <https://github.com/DigitalPebble/storm-crawler/blob/master/core/src/main/java/com/digitalpebble/stormcrawler/ConfigurableTopology.java>
>> and the one in Storm
>> <https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/topology/ConfigurableTopology.java>
>> when I extend it: with the latter, any configuration found in the conf files
>> passed as command-line arguments is added to the default values I provide
>> instead of replacing them. I'll investigate that further and open an issue if
>> I find a bug.
>>
>> *Distributed mode*
>> I managed to launch the various services and run my test topology in
>> remote mode (by changing the protocol implementation as explained above).
>>
>> *Flux*
>> http://storm.apache.org/releases/2.0.0-SNAPSHOT/flux.html tells me to run
>>
>> storm jar myTopology-0.1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local my_config.yaml
>>
>> so I ran
>>
>> *apache-storm-2.0.0/bin/storm jar target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux*
>>
>> but am getting
>>
>> *15:07:26.206 [main] ERROR o.a.s.f.Flux - To run in local mode run with 'storm local' instead of 'storm jar'*
>>
>> So I tried both
>>
>> apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux
>>
>> and
>>
>> *apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux crawler.flux*
>>
>> but in both cases I'm getting
>>
>> *15:12:06.784 [main] ERROR o.a.s.f.Flux - To run in local mode run with 'storm local' instead of 'storm jar'*
>> *15:12:06.784 [main] INFO  o.a.s.LocalCluster - *
>>
>> * RUNNING LOCAL CLUSTER for 20 seconds.*
>>
>> and nothing happens: the topology just dies after 20 secs without fetching
>> any URLs.
>>
>> I haven't tried Flux in distributed mode yet.
>>
>> Thanks!
>>
>> Julien
>>
>> PS: my test topology is in https://github.com/DigitalPebble/storm2
>>
>> On Fri, 19 Oct 2018 at 19:32, Julien Nioche <
>> lists.digitalpeb...@gmail.com>
>> wrote:
>>
>> > Hi Bobby
>> >
>> > The dependency issue happens when I have only storm-client as a
>> > dependency and not server.
>> >
>> > When trying to run it from Eclipse I had to add server to the pom, as
>> > expected, but also client, as I was getting
>> >
>> > 19:22:13.044 [main] ERROR o.a.s.u.VersionInfo - Could not load storm-core-version-info.properties
>> > java.io.IOException: Resource not found
>> > at org.apache.storm.utils.VersionInfo$VersionInfoImpl.<init>(VersionInfo.java:53) [storm-client-2.0.0.jar:2.0.0]
>> > at org.apache.storm.utils.VersionInfo.<clinit>(VersionInfo.java:41) [storm-client-2.0.0.jar:2.0.0]
>> > at org.apache.storm.daemon.nimbus.Nimbus.<clinit>(Nimbus.java:281) [storm-server-2.0.0.jar:2.0.0]
>> > at org.apache.storm.LocalCluster.<init>(LocalCluster.java:235) [storm-server-2.0.0.jar:2.0.0]
>> > at org.apache.storm.LocalCluster.<init>(LocalCluster.java:156) [storm-server-2.0.0.jar:2.0.0]
>> > at com.digitalpebble.stormcrawler.ConfigurableTopology.submit(ConfigurableTopology.java:74) [classes/:?]
>> > at com.dipe.sc.CrawlTopology.run(CrawlTopology.java:80) [classes/:?]
>> > at com.digitalpebble.stormcrawler.ConfigurableTopology.start(ConfigurableTopology.java:49) [classes/:?]
>> > at com.dipe.sc.CrawlTopology.main(CrawlTopology.java:39) [classes/:?]
>> >
>> > I've put the code in https://github.com/DigitalPebble/storm2 if you want
>> > to have a look. You'll need to compile the branch 2.x of SC first:
>> > https://github.com/DigitalPebble/storm-crawler/tree/2.x
>> >
>> > To reproduce the ZK issue, open the project in Eclipse and run the
>> > CrawlTopology class with "-local -conf crawler-conf.yaml" in arguments.
>> >
>> > For the dependency problem, mvn clean package followed by
>> > /data/apache-storm-2.0.0/bin/storm local target/2-1.0-SNAPSHOT.jar com.dipe.sc.CrawlTopology -conf crawler-conf.yaml
>> > should give java.lang.NoSuchMethodError:
>> > org.apache.http.impl.client.HttpClientBuilder.setConnectionManagerShared(Z)Lorg/apache/http/impl/client/HttpClientBuilder;
>> >
>> > Thanks
>> >
>> > Julien
>> >
>> > On Fri, 19 Oct 2018 at 17:26, Bobby Evans <bo...@apache.org> wrote:
>> >
>> >> Sorry, I should clarify a bit.
>> >>
>> >> `storm local` will run things in local mode, but the classpath will
>> >> include things that are not shaded.
>> >>
>> >> This is also true for trying to run tests from Eclipse. LocalCluster is a
>> >> part of storm-server so you will need to pull that in just for testing.
>> >> storm-client is what you want to depend on for the majority of your
>> >> topology.
>> >>
>> >> The ZK issue is new to me. We have done a lot in local mode and not seen
>> >> that as an issue. If you can help me reproduce it I am happy to try and
>> >> debug it to see what is happening.
>> >>
>> >> Thanks,
>> >>
>> >> Bobby
>> >>
>> >> On Fri, Oct 19, 2018 at 11:21 AM Bobby Evans <bo...@apache.org> wrote:
>> >>
>> >> > It is shaded in storm 2.x, but we split the classpath up, so what you
>> >> > want to depend on is storm-client only. I see you are pulling in
>> >> > storm-core and a few other things that are not shaded, because they are
>> >> > only used by the daemons, not the clients.
>> >> >
>> >> > On Fri, Oct 19, 2018 at 10:55 AM Julien Nioche <
>> >> > lists.digitalpeb...@gmail.com> wrote:
>> >> >
>> >> >> Sorry, hit Return too quickly
>> >> >>
>> >> >> I am testing Storm 2.0.0 with StormCrawler, not very successfully.
>> >> >> One immediate issue is that I am getting a version conflict on
>> >> >> httpclient as the version set by Storm is older than the one I need.
>> >> >>
>> >> >> java.lang.NoSuchMethodError: org.apache.http.impl.client.HttpClientBuilder.setConnectionManagerShared(Z)Lorg/apache/http/impl/client/HttpClientBuilder;
>> >> >> at com.digitalpebble.stormcrawler.protocol.httpclient.HttpProtocol.configure(HttpProtocol.java:141) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >> at com.digitalpebble.stormcrawler.protocol.ProtocolFactory.<init>(ProtocolFactory.java:69) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >> at com.digitalpebble.stormcrawler.bolt.FetcherBolt.prepare(FetcherBolt.java:760) ~[2-1.0-SNAPSHOT.jar:?]
>> >> >> at org.apache.storm.executor.bolt.BoltExecutor.init(BoltExecutor.java:144) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >> at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:154) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >> at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:58) ~[storm-client-2.0.0.jar:2.0.0]
>> >> >> at org.apache.storm.utils.Utils$1.run(Utils.java:353) [storm-client-2.0.0.jar:2.0.0]
>> >> >> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
>> >> >>
>> >> >> Here is the classpath when calling *storm local ....*
>> >> >>
>> >> >> *16:38:03.445 [main] INFO  o.a.s.s.o.a.z.ZooKeeper - Client
>> >> >>
>> >> >>
>> >>
>> environment:java.class.path=/data/apache-storm-2.0.0/*:/data/apache-storm-2.0.0/lib/log4j-over-slf4j-1.6.6.jar:/data/apache-storm-2.0.0/lib/hadoop-auth-2.6.1.jar:/data/apache-storm-2.0.0/lib/jaxb-api-2.3.0.jar:/data/apache-storm-2.0.0/lib/kryo-shaded-3.0.3.jar:/data/apache-storm-2.0.0/lib/kryo-3.0.3.jar:/data/apache-storm-2.0.0/lib/commons-cli-1.4.jar:/data/apache-storm-2.0.0/lib/log4j-slf4j-impl-2.11.1.jar:/data/apache-storm-2.0.0/lib/jetty-continuation-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/httpclient-4.3.3.jar:/data/apache-storm-2.0.0/lib/commons-io-2.6.jar:/data/apache-storm-2.0.0/lib/commons-collections-3.2.2.jar:/data/apache-storm-2.0.0/lib/guava-16.0.1.jar:/data/apache-storm-2.0.0/lib/metrics-graphite-3.2.6.jar:/data/apache-storm-2.0.0/lib/jetty-http-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/tools.logging-0.2.3.jar:/data/apache-storm-2.0.0/lib/jetty-util-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/rocksdbjni-5.8.6.jar:/data/apache-storm-2.0.0/lib/commons-fileupload-1.3.3.jar:/data/apache-storm-2.0.0/lib/curator-framework-4.0.1.jar:/data/apache-storm-2.0.0/lib/jackson-dataformat-smile-2.9.4.jar:/data/apache-storm-2.0.0/lib/asm-5.0.3.jar:/data/apache-storm-2.0.0/lib/jetty-io-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/chill-java-0.8.0.jar:/data/apache-storm-2.0.0/lib/curator-client-4.0.1.jar:/data/apache-storm-2.0.0/lib/httpcore-4.3.2.jar:/data/apache-storm-2.0.0/lib/log4j-api-2.11.1.jar:/data/apache-storm-2.0.0/lib/jetty-security-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/storm-clojure-2.0.0.jar:/data/apache-storm-2.0.0/lib/commons-compress-1.16.1.jar:/data/apache-storm-2.0.0/lib/jetty-server-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/netty-3.7.0.Final.jar:/data/apache-storm-2.0.0/lib/json-simple-1.1.jar:/data/apache-storm-2.0.0/lib/junit-4.12.jar:/data/apache-storm-2.0.0/lib/jetty-servlet-9.4.7.v20170914.jar:/data/apache-storm-2.0.0/lib/objenesis-2.6.jar:/data/apache-storm-2.0.0/lib/jetty-servlets-9.4.7.v20170914.jar:
/data/apache-storm-2.0.0/lib/carbonite-1.5.0.jar:/data/apache-storm-2.0.0/lib/storm-server-2.0.0.jar:/data/apache-storm-2.0.0/lib/shaded-deps-2.0.0.jar:/data/apache-storm-2.0.0/lib/javax.servlet-api-3.1.0.jar:/data/apache-storm-2.0.0/lib/commons-logging-1.1.3.jar:/data/apache-storm-2.0.0/lib/jline-0.9.94.jar:/data/apache-storm-2.0.0/lib/storm-client-2.0.0.jar:/data/apache-storm-2.0.0/lib/snakeyaml-1.11.jar:/data/apache-storm-2.0.0/lib/hamcrest-core-1.3.jar:/data/apache-storm-2.0.0/lib/minlog-1.3.0.jar:/data/apache-storm-2.0.0/lib/slf4j-api-1.7.21.jar:/data/apache-storm-2.0.0/lib/log4j-core-2.11.1.jar:/data/apache-storm-2.0.0/lib/commons-exec-1.3.jar:/data/apache-storm-2.0.0/lib/storm-core-2.0.0.jar:/data/apache-storm-2.0.0/lib/jackson-core-2.9.4.jar:/data/apache-storm-2.0.0/lib/zookeeper-3.4.6.jar:/data/apache-storm-2.0.0/lib/commons-lang-2.6.jar:/data/apache-storm-2.0.0/lib/clojure-1.7.0.jar:/data/apache-storm-2.0.0/lib/metrics-core-3.2.6.jar:/data/apache-storm-2.0.0/lib/reflectasm-1.10.1.jar:/data/apache-storm-2.0.0/lib/commons-codec-1.11.jar:/data/apache-storm-2.0.0/lib/joda-time-2.3.jar:/data/apache-storm-2.0.0/extlib/*:target/2-1.0-SNAPSHOT.jar:/data/apache-storm-2.0.0/conf:/data/apache-storm-2.0.0/bin*
>> >> >>
>> >> >> This doesn't happen with Storm 1.2.2. Aren't these libs supposed to
>> >> >> be shaded by Storm?
>> >> >>
>> >> >> Another issue is when I try to launch a topology from Eclipse (as I
>> >> >> was able to do with Storm 1.x), even when adding
>> >> >>
>> >> >> <dependency>
>> >> >>   <groupId>org.apache.storm</groupId>
>> >> >>   <artifactId>storm-server</artifactId>
>> >> >>   <version>2.0.0</version>
>> >> >> </dependency>
>> >> >> <dependency>
>> >> >>   <groupId>org.apache.storm</groupId>
>> >> >>   <artifactId>storm-core</artifactId>
>> >> >>   <version>2.0.0</version>
>> >> >> </dependency>
>> >> >>
>> >> >> as suggested by
>> >> >> http://storm.apache.org/releases/2.0.0-SNAPSHOT/Local-mode.html, there
>> >> >> seems to be an issue with ZK. The 2nd dependency is not mentioned on
>> >> >> that page but seems to be needed.
>> >> >>
>> >> >> *16:50:53.041 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x1668d05b0630007 type:create cxid:0x2 zxid:0x28 txntype:-1 reqpath:n/a Error Path:/storm/blobstoremaxkeysequencenumber Error:KeeperErrorCode = NoNode for /storm/blobstoremaxkeysequencenumber*
>> >> >>
>> >> >> and the topology never starts. I could, of course, rely on "storm
>> >> >> local" but being able to run a local topology without installing Storm
>> >> >> is quite nice for users who just want to give it a try.
>> >> >>
>> >> >> Any thoughts?
>> >> >>
>> >> >> Julien
>> >> >>
>> >> >>
>> >> >> On Fri, 19 Oct 2018 at 16:40, Julien Nioche <
>> >> >> lists.digitalpeb...@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > I am testing Storm 2.0.0 with StormCrawler, not very successfully
>> >> >> >
>> >> >> > On Tue, 16 Oct 2018 at 20:48, P. Taylor Goetz <ptgo...@gmail.com>
>> >> >> wrote:
>> >> >> >
>> >> >> >> This is a call to vote on releasing Apache Storm 2.0.0 (rc3)
>> >> >> >>
>> >> >> >> Full list of changes in this release:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/RELEASE_NOTES.html
>> >> >> >>
>> >> >> >> The tag/commit to be voted upon is v2.0.0:
>> >> >> >>
>> >> >> >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=commit;h=d2d6f40344e6cc92ab07f3a462d577ef6b61f8b1
>> >> >> >>
>> >> >> >> The source archive being voted upon can be found here:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/apache-storm-2.0.0-src.tar.gz
>> >> >> >>
>> >> >> >> Other release files, signatures and digests can be found here:
>> >> >> >>
>> >> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc3/
>> >> >> >>
>> >> >> >> The release artifacts are signed with the following key:
>> >> >> >>
>> >> >> >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> >> >> >>
>> >> >> >> The Nexus staging repository for this release is:
>> >> >> >>
>> >> >> >> https://repository.apache.org/content/repositories/orgapachestorm-1072
>> >> >> >>
>> >> >> >> Please vote on releasing this package as Apache Storm 2.0.0.
>> >> >> >>
>> >> >> >> When voting, please list the actions taken to verify the release.
>> >> >> >>
>> >> >> >> This vote will be open for at least 72 hours.
>> >> >> >>
>> >> >> >> [ ] +1 Release this package as Apache Storm 2.0.0
>> >> >> >> [ ]  0 No opinion
>> >> >> >> [ ] -1 Do not release this package because...
>> >> >> >>
>> >> >> >> Thanks to everyone who contributed to this release.
>> >> >> >>
>> >> >> >> -Taylor
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > *Open Source Solutions for Text Engineering*
>> >> >> >
>> >> >> > http://www.digitalpebble.com
>> >> >> > http://digitalpebble.blogspot.com/
>> >> >> > #digitalpebble <http://twitter.com/digitalpebble>
>> >> >> >
>> >> >>
>> >> >>
>> >> >
>> >>
>> >
>>
>
