[ https://issues.apache.org/jira/browse/SOLR-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729356#comment-17729356 ]
Jason Gerlowski commented on SOLR-16812:
----------------------------------------

I'm all for this ticket in general - I think moving *to* CBOR (or something like it) and *away* from javabin is great for our users (and for us as maintainers).

But still, I was a little taken aback to see this merged (and immediately backported) already. It seemed like there were (and still are) a few open questions to sort out from the discussion above. In particular:

# *What does the CBOR performance look like generally?* As I mentioned above, "films.json" feels a little small to be testing this. Further, reading the JUnit-benchmark code you shared - I think it should be beefed up a bit before we use it as a benchmark, even anecdotally. (See forthcoming PR review comments.) Ishan mentioned he was hoping to soon share some WIP solr-bench code that would shed much more light on the performance picture. I was really looking forward to that!
# *CBOR vs. Other Potential Formats* Can you elaborate at all on why you chose CBOR over other alternatives? (e.g. Eric Pugh mentioned Avro and Arrow, though of course others exist as well.) Is CBOR better performance-wise? Was it chosen for some differentiating feature (e.g. streaming support)? For general popularity? etc.
# *Javabin Deprecation/Replacement Plans* From your response above it sounds like we were on the same page that if we introduce a new binary format, then it should come with a plan to deprecate or replace javabin. Which is great! But I don't think your PR tackled that at all, and there are probably still some details to hash out around what that should look like.

There's no need to revert here, I don't think. But at the same time let's make sure these questions don't fall through the cracks just because the code is already "in".
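As a rough illustration of the payload-size side of the first question, here is a minimal sketch that encodes the same document as compact JSON and as CBOR and compares the byte counts. It is not the JUnit benchmark or the solr-bench code discussed in this thread. The encoder covers only a small subset of RFC 8949, the sample document is hypothetical (it is not taken from films.json), and the names `cbor_encode` and `payload_sizes` are made up for this example. It measures size only, not indexing or query time.

```python
import json
import struct


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (RFC 8949, section 3)."""
    if n < 24:
        # Small arguments are packed directly into the initial byte.
        return bytes([(major << 5) | n])
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([(major << 5) | info]) + struct.pack(fmt, n)
    raise ValueError("integer too large for this sketch")


def cbor_encode(value) -> bytes:
    """Encode None, bool, int, float, str, list/tuple and dict as CBOR."""
    if value is None:
        return b"\xf6"
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, float):
        # Always use the 8-byte form; real encoders may pick shorter floats.
        return b"\xfb" + struct.pack(">d", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(value, (list, tuple)):
        return _head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in value.items())
        return _head(5, len(value)) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


def payload_sizes(docs) -> dict:
    """Return encoded sizes in bytes for compact JSON and CBOR."""
    json_bytes = json.dumps(docs, separators=(",", ":")).encode("utf-8")
    return {"json": len(json_bytes), "cbor": len(cbor_encode(docs))}


# Hypothetical document shaped loosely like a film record.
sample_doc = {
    "id": "film-1",
    "name": "Example Film",
    "genre": ["Drama", "Comedy"],
    "year": 2001,
}

if __name__ == "__main__":
    # Repeat the document to mimic a batch update payload.
    print(payload_sizes([sample_doc] * 1000))
```

A comparison like this says nothing about encode or decode speed. A meaningful timing benchmark would additionally need a larger and more varied corpus, warm-up iterations, and repeated runs, which is the concern raised in the first question.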
> Support CBOR format for update/query
> ------------------------------------
>
>                 Key: SOLR-16812
>                 URL: https://issues.apache.org/jira/browse/SOLR-16812
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Javabin is quite efficient and fast. But non-Java users have to use JSON
> exclusively.
>
> [CBOR |http://example.com/] is a widely used format that is supported by most
> languages.
>
> Here is a benchmark of updating and querying with CBOR vs. JSON, using our films.json
> {code:java}
> Payload Size (bytes)
> ====================
> json   : 633600
> cbor   : 290672
> javabin: 234520
>
> time taken to index
> ===================
> JSON   : 583ms
> CBOR   : 509ms
> JAVABIN: 549ms
>
> time taken to query *:* 1100 docs
> =================================
> json   : 92ms
> javabin: 70ms
> cbor   : 63ms
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)