There are at least 3 RPC compatibility-breaking changes in
htrace-htraced between 4.0.1 and 4.1.0:

HTRACE-315 changed the default port for htraced's HTTP interface from
9095 to 9096.

HTRACE-237 changed the HTTP wire format for htraced slightly.
Previously, we just sent a whitespace-separated list of trace spans.
Now we send an actual JSON object.

HTRACE-308 (Deserialize WriteSpans requests incrementally rather than
all at once) renamed the field "Spans" in the HRPC header to "NumSpans".
It also decreased the maximum RPC size, MAX_HRPC_BODY_LENGTH, from 64 MB
to 32 MB.

To be honest, the main point of 4.0.1 was to stabilize the htrace-core4
API and make a first release with the GUI. There was a lot of
unfinished business in htraced-- things that we only got to in 4.1.
htraced is way more stable in 4.1 since we dealt with things like GC
pressure, the client side, and so forth. We also just fixed bugs.

I think we should accept the compatibility break with 4.0.1. However,
I agree that it would be nice to support the "old client, new server"
case. I filed HTRACE-344 to add a mechanism that makes it easier to
detect the case where the client is too new, and give back a reasonable
response.

I don't think we should adopt a formal compatibility policy-- htraced
is not mature enough for that. But in 4.2 (or whatever the next release
ends up being), we can strive harder to maintain compatibility than we
have so far.

best,
Colin


On Mon, Feb 22, 2016 at 10:37 AM, Stack <[email protected]> wrote:
> On Mon, Feb 22, 2016 at 9:51 AM, Colin P. McCabe <[email protected]> wrote:
>
>> On Sun, Feb 21, 2016 at 8:10 PM, Stack <[email protected]> wrote:
>> > "The rationale for this limitation is that tracing can simply be disabled
>> > for a brief period during the rolling upgrade process."
>> >
>> > The second time an operator has to do this, they'll just throw away tracing
>> > as a PITA.
>>
>> Let's put this in perspective. Apache Spark recently transitioned
>> from Scala 2.10 to 2.11. Those Scala releases aren't binary compatible.
>> They aren't even source-code compatible, which means that
>> people potentially had to rewrite their Spark jobs just to perform an
>> upgrade... let alone have things continue to work during the upgrade.
>> And people didn't throw away Spark; it's more popular than ever.
>>
> By 'perspective', you mean others made a mess so we can too?
>
>> Now: It makes sense for a storage system (particularly a mature and
>> widely-deployed one) to bend over backwards to stay up during
>> upgrades. That's why HDFS is so strict about this, and HBase as well.
>> But they weren't always that strict; we used to break RPC
>> compatibility with every release in the earlier days. Also, HTrace is
>> not a storage system! It's a tracing system. It can be unavailable
>> for a few hours. It will be OK.
>>
>> What's not OK is for us to have CLASSPATH conflicts within a minor
>> version of htrace-core4 that will create problems during rolling
>> upgrades. It is not OK to remove APIs in htrace-core4, and so forth
>> and so on. We thought about this stuff very carefully when coming up
>> with the compatibility policy to determine what was acceptable and
>> what was not.
>>
> I'm not talking about CLASSPATH, I'm talking about the sending of spans.
>
> Do you know for sure that a 4.0.1 client can't talk to a 4.1.0-based sink?
> If so, do you know what is broke?
>
> Thanks Colin,
> St.Ack
>
>> As the project matures, we can think about adopting a more generous
>> (and much more difficult to implement) compatibility policy on a
>> component-by-component basis. But currently, a lot of the components
>> are not very mature. For example, htrace-flume has seen very little
>> development. htrace-htraced is probably the most mature component,
>> but a lot of its features are new. 4.0.1 was the first release with a
>> real GUI and usable client support. Trying to implement an HDFS or
>> HBase-style compatibility policy right now would slow down development
>> greatly, for no gain.
>>
>> best,
>> Colin
>>
>> >
>> > Tracing needs to bend to serve the traced systems, not the other way around.
>> >
>> > A 4.0.1 can't talk to a 4.1.0? Do you know how it is broken?
>> >
>> > St.Ack
>> >
>> > On Fri, Feb 19, 2016 at 5:17 PM, Colin P. McCabe <[email protected]> wrote:
>> >
>> >> Our compatibility policy (see
>> >> http://mail-archives.apache.org/mod_mbox/htrace-dev/201509.mbox/%[email protected]%3E
>> >> ) only covers the htrace-core4 API right now. So we can guarantee
>> >> that any projects using htrace-core 4.0.1 can upgrade to htrace-core
>> >> 4.1.0 without breaking anything. (This is a more painful guarantee
>> >> than it sounds since it means we can't remove functions, only
>> >> deprecate them... And so forth.) But it's a very useful guarantee
>> >> for our downstream projects.
>> >>
>> >> However, we don't support mixing and matching versions of the
>> >> SpanReceiver client and server components. The admin has to roll out
>> >> a uniform version of those components-- for example, using htraced
>> >> 4.0.1 with htrace-htraced.jar 4.1.0 is not supported. The rationale
>> >> for this limitation is that tracing can simply be disabled for a brief
>> >> period during the rolling upgrade process. Also, the different
>> >> SpanReceiver subprojects are at different levels of maturity, and
>> >> imposing heavy compatibility guarantees would slow down development
>> >> for no real gain.
>> >>
>> >> best,
>> >> Colin
>> >>
>> >> On Fri, Feb 19, 2016 at 4:32 PM, Stack <[email protected]> wrote:
>> >> > Can a 4.0.1 client talk to a 4.1.0 htrace? Has it been tested?
>> >> > St.Ack
>> >> >
>> >> > On Tue, Feb 9, 2016 at 7:00 PM, Colin P. McCabe <[email protected]> wrote:
>> >> >
>> >> >> Hi all,
>> >> >>
>> >> >> I've posted the second release candidate for HTrace 4.1 here:
>> >> >>
>> >> >> http://people.apache.org/~cmccabe/htrace/releases/4.1.0/rc2/
>> >> >>
>> >> >> The jars have been staged here:
>> >> >>
>> >> >> https://repository.apache.org/content/repositories/orgapachehtrace-1022
>> >> >>
>> >> >> Compared to rc1, this rc includes HTRACE-334 and HTRACE-342.
>> >> >>
>> >> >> HTrace 4.1 brings a lot of robustness improvements. There were major
>> >> >> improvements to htraced and the web UI, as well as new metrics added.
>> >> >> There were numerous build fixups, and we added Docker support, to
>> >> >> ensure a repeatable build.
>> >> >>
>> >> >> Check it out. The vote will run for 5 days.
>> >> >>
>> >> >> cheers,
>> >> >> Colin
>> >> >>
>> >> >>
>> >> >> Release Notes - HTrace - Version 4.1
>> >> >>
>> >> >> ** Bug
>> >> >>   * [HTRACE-114] - Fix compilation error of htrace-hbase against hbase-1.0.0
>> >> >>   * [HTRACE-238] - Change maven compiler source level to 1.7 to match targetJdk
>> >> >>   * [HTRACE-243] - Remove duplicate maven-assembly-plugin configuration section in htrace-htraced/pom.xml
>> >> >>   * [HTRACE-245] - NOTICE.txt: change "developed by The Apache Software...” to "developed at The Apache Software...”
>> >> >>   * [HTRACE-246] - HTrace WebApp not properly defined and therefore not packaged into .war
>> >> >>   * [HTRACE-248] - HTraced should gracefully shutdown if stopped
>> >> >>   * [HTRACE-249] - Script and doc on how to publish website
>> >> >>   * [HTRACE-251] - Fix "mvn clean" target
>> >> >>   * [HTRACE-253] - Tracer loadSamplers and loadSpanReceivers logs are too chatty
>> >> >>   * [HTRACE-256] - Change the artifactId for htrace-core in branch 4.0 to be htrace-core4
>> >> >>   * [HTRACE-257] - htrace-htraced: add web symlink rather than generating programmatically
>> >> >>   * [HTRACE-262] - Temporarily suppress doclint for Java 8 to prevent build failure
>> >> >>   * [HTRACE-266] - Make the CLIENT_REST_MAX_SPANS_AT_A_TIME_KEY config key more consistent with other configs
>> >> >>   * [HTRACE-267] - Move owl logo licensing information from NOTICE to LICENSE
>> >> >>   * [HTRACE-268] - Remove Units and go-codec from LICENSE since they are not contained in the source release
>> >> >>   * [HTRACE-272] - TracerPool must not load multiple inscance of same receiver class when a simple classname is given
>> >> >>   * [HTRACE-279] - Fix issues where the HTracedSpanReceiver was using the wrong JSON serialization for spans and add validation to htraced REST ingest path
>> >> >>   * [HTRACE-280] - htraced: add metrics about total spans added and dropped per address
>> >> >>   * [HTRACE-281] - htraced: add example/htraced-conf.xml
>> >> >>   * [HTRACE-282] - htraced: reap spans which are older than a configurable interval
>> >> >>   * [HTRACE-283] - Heartbeater should wait for goroutine to finish on close
>> >> >>   * [HTRACE-284] - htrace-htraced, htrace-flume: do not treat the shaded version of commons-logging as provided
>> >> >>   * [HTRACE-285] - htraced tool: fix query parsing and add query_test
>> >> >>   * [HTRACE-289] - Fix TraceEnabled, etc. logger methods for conditional logging
>> >> >>   * [HTRACE-294] - htraced: fix some metrics issues
>> >> >>   * [HTRACE-297] - htraced: avoid serializing spans to json unless TRACE logging is enabled
>> >> >>   * [HTRACE-300] - Reaper should be initialized before shards are activated
>> >> >>   * [HTRACE-301] - htraced: fix unit tests that aren't waiting for spans to be written, use semaphore for WrittenSpans
>> >> >>   * [HTRACE-302] - htraced: Add admissions control to HRPC to limit the number of incoming messages
>> >> >>   * [HTRACE-304] - htraced: fix bug with GREATER_THAN queries
>> >> >>   * [HTRACE-307] - htraced: queries sometimes return no results even when many results exist due to confusion in iterator usage
>> >> >>   * [HTRACE-311] - htraced: Fix logging to stdout via -Dlog.path=
>> >> >>   * [HTRACE-316] - htrace-web: span.js issue: span ID string length is 32, not 36
>> >> >>   * [HTRACE-317] - Fix the documentation for adding tracing to an application to reflect HTrace 4.x API changes
>> >> >>   * [HTRACE-328] - htraced continues scanning in some cases even when no more results are possible
>> >> >>
>> >> >> ** Improvement
>> >> >>   * [HTRACE-342] - centralize building instructions in BUILDING.txt
>> >> >>   * [HTRACE-334] - htrace-web: Make limit of search and children API configurable
>> >> >>   * [HTRACE-129] - htraced: add /server/stats REST endpoint
>> >> >>   * [HTRACE-156] - HTrace GUI: add about view
>> >> >>   * [HTRACE-181] - gui: Split "about" screen
>> >> >>   * [HTRACE-237] - Optimize htraced span receiver
>> >> >>   * [HTRACE-239] - Add htrace/impl/TestZipkinSpanReceiver.java
>> >> >>   * [HTRACE-260] - htrace-zipkin should not set the obsolete duration field in thrift
>> >> >>   * [HTRACE-271] - Add log4j.properties to all submodule tests
>> >> >>   * [HTRACE-276] - Shade classes into org.apache.htrace.shaded rather than org.apache.htrace
>> >> >>   * [HTRACE-286] - htraced: improvements to logging, daemon startup, and configuration
>> >> >>   * [HTRACE-290] - htraced: Fix per-faculty log level settings and add unit tests for conditional logging
>> >> >>   * [HTRACE-291] - rename bin/htrace to bin/htracedTool
>> >> >>   * [HTRACE-292] - "htracedTool version" should display the git hash, and -Dgit.version option should be available for build
>> >> >>   * [HTRACE-295] - htraced: setting span.expiry.ms to 0 should disable span expiry
>> >> >>   * [HTRACE-296] - htraced tests: make sure local settings for HTRACED_WEB_DIR and HTRACE_CONF_DIR don't affect unit tests
>> >> >>   * [HTRACE-298] - htraced: improve datastore serialization and metrics
>> >> >>   * [HTRACE-303] - Add client-side htraceDropped log file to track dropped spans
>> >> >>   * [HTRACE-305] - htrace-web: Use greater-than-or-equal rather than greater-than in more places
>> >> >>   * [HTRACE-306] - htraced: logs should use UTC
>> >> >>   * [HTRACE-308] - Deserialize WriteSpans requests incrementally rather than all at once to optimize GC
>> >> >>   * [HTRACE-310] - htracedTool: when there is an error response, print the body of the response
>> >> >>   * [HTRACE-312] - htraced: if GOMAXPROCS is left at 1, set it to the number of CPUs
>> >> >>   * [HTRACE-313] - htraced span receiver clientDropped file should include timestamps
>> >> >>   * [HTRACE-314] - htraced: make datastore loading safer
>> >> >>   * [HTRACE-327] - HTRACE-327: improve htraced command-line parsing and add version command
>> >> >>   * [HTRACE-334] - htrace-web: Make limit of search and children API configurable
>> >> >>   * [HTRACE-335] - htrace-web: Adjust size of span widget
>> >> >>   * [HTRACE-339] - Major type in htrace-flume README
>> >> >>
>> >> >> ** New Feature
>> >> >>   * [HTRACE-235] - htrace-zipkin - add Kafka transport support
>> >> >>   * [HTRACE-277] - htraced: Add /server/conf endpoint to get server configuration
>> >> >>   * [HTRACE-278] - htraced: dump thread stacks and GC statistics when SIGQUIT is sent
>> >> >>   * [HTRACE-288] - htraced: Add a user interface to view server version, metrics, and configuration
>> >> >>   * [HTRACE-293] - htrace-web: control-click should fully expand trace trees
>> >> >>   * [HTRACE-299] - htraced: add /server/debugInfo REST endpoint to get stack traces and GC stats
>> >> >>   * [HTRACE-309] - htraced: improve leveldb configuration
>> >> >>   * [HTRACE-323] - htrace-web: change the cursor to a spinner while a search is in progress
>> >> >>   * [HTRACE-332] - htraced: optionally enable leveldb LRU cache
>> >> >>
>> >> >> ** Task
>> >> >>   * [HTRACE-241] - Docker image for HTrace
>> >> >>   * [HTRACE-315] - htraced: change default web port from 9095 to 9096
>> >> >>   * [HTRACE-319] - mark versions 4.0 and 4.0.1 as released
>> >> >>   * [HTRACE-331] - create git tags for 4.0 and 4.0.1 releases
>> >> >>
>> >> >> ** Wish
>> >> >>   * [HTRACE-269] - HTraceConfiguration support to get the map of configurations
