[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647360#comment-13647360 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

I am working on a modification to the builder that would make its use look like a JSON schema.

{code}
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":["
    + "{\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},"
    + "{\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},"
    + "{\"name\":\"serverHash\",\"type\":\"MD5\"},"
    + "{\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}"
    + "]}");
{code}

becomes similar to:

{code}
public static final org.apache.avro.Schema SCHEMA$ = SchemaBuilder
    .typeRecord("HandshakeRequest").namespaceInherited("org.apache.avro.ipc").fields() // optional namespace inheritance
    .typeFixed("clientHash", MD5.SCHEMA$).field() // or typeFixed("clientHash", "MD5", 16)
    .typeUnion("clientProtocol").ofNull().andString().withProp("avro.java.string", "String").field()
    .typeFixed("serverHash", "MD5").field() // uses reference to already defined MD5
    .typeUnion("meta").ofNull().andMap().withProp("avro.java.string", "String").valuesBytes().field()
    .record();
{code}

We can also have shortcuts as before, for example:

optionalInt("x", -1) as a shortcut for typeUnion("x").ofInt(-1).andNull()
nullableInt("maybe") as a shortcut for typeUnion("maybe").ofNull(null).andInt()
requiredInt("yes") may not be necessary; its shortcut would be typeInt("yes").field();

It should be straightforward to implement the whole Schema.Parser with the above (and simplify the parser), which makes it easy to test very thoroughly; there is an intentional 1:1 mapping between the parser, spec, and the builder.
Add a schema builder API
------------------------

Key: AVRO-1274
URL: https://issues.apache.org/jira/browse/AVRO-1274
Project: Avro
Issue Type: New Feature
Components: java
Reporter: Tom White
Assignee: Tom White
Fix For: 1.7.5
Attachments: AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, TestDefaults.patch

It would be nice to have a fluent API that made it easier to construct record schemas.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647629#comment-13647629 ]

Tom White commented on AVRO-1274:
---------------------------------

I'm slightly reluctant to add lots of overloaded methods (as I mentioned above), since it makes the builder much harder to use in an IDE with autocompletion. Will the user be able to see the difference between optionalInt and nullableInt? Or requiredInt and typeInt?

A way to specify properties is missing so we should add that. Let's discuss this and other changes in new JIRAs.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648048#comment-13648048 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

I am planning on constraining the lexical scope via many cascaded builders / assemblers so that the list to auto-complete at any time is small. I'll make a new JIRA for my proposed changes.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647196#comment-13647196 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

We may have more work to do here. How would you use the builder to do the equivalent of:

{code}
public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":["
    + "{\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},"
    + "{\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},"
    + "{\"name\":\"serverHash\",\"type\":\"MD5\"},"
    + "{\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}"
    + "]}");
{code}

I am trying to suggest that we replace literal strings with the builder in AVRO-1316 but cannot seem to replicate the above with the builder. The clientProtocol and meta fields are the problem. It does not seem possible to create a union of null and 'more' without a default.

Additionally, unionType is confusing. Is this how it would be done? If so, I do not see how to add types to the union if I start with:

{code}
unionType("clientProtocol", SchemaBuilder.NULL)
{code}

Then how do I add extra types? Or is the type passed in expected to _be_ a union? If so, the field should be named unionSchema and the javadoc needs to be clear.

This builder API makes it hard to create union fields without defaults. Perhaps it is simply a documentation issue and the doc for unionType() needs an example.

Should we open a new ticket for these concerns or re-open this one? I suspect it is largely documentation but am not sure.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647221#comment-13647221 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

I think the answer to my question would be:

{code}
public static final org.apache.avro.Schema SCHEMA$;
static {
  SCHEMA$ = SchemaBuilder
    .recordType("HandshakeRequest")
    .namespace("org.apache.avro.ipc")
    .requiredFixed("clientHash", MD5.SCHEMA$)
    .unionType("clientProtocol", SchemaBuilder.unionType(
        SchemaBuilder.NULL,
        SchemaBuilder.STRING)
      .build())
    .addFieldProp("avro.java.string", "String")
    .requiredFixed("serverHash", MD5.SCHEMA$)
    .unionType("meta", SchemaBuilder.unionType(
        SchemaBuilder.NULL,
        SchemaBuilder.mapType(SchemaBuilder.BYTES)
          .addFieldProp("avro.java.string", "String")
          .build())
      .build())
    .build();
}
{code}

but I am not sure. Also, addFieldProp() does not exist.

What is odd is that there are two unionType() methods: one takes varargs and the other does not. I suspect that the intention was for both to use varargs so that the nested union building is not required by the user.

It would be much simpler if unions without defaults had a shortcut:

{code}
public static final org.apache.avro.Schema SCHEMA$;
static {
  SCHEMA$ = SchemaBuilder
    .recordType("HandshakeRequest")
    .namespace("org.apache.avro.ipc")
    .requiredFixed("clientHash", MD5.SCHEMA$)
    .nullableString("clientProtocol")
    .addFieldProp("avro.java.string", "String")
    .requiredFixed("serverHash", MD5.SCHEMA$)
    .nullableMap(SchemaBuilder.BYTES)
    .addFieldProp("avro.java.string", "String")
    .build();
}
{code}

Building unions in general feels clunky as well, since you have to break chaining and use SchemaBuilder again. Instead of taking a varargs list of schemas in the union, the type returned could be a UnionBuilder.

So instead of:

{code}
public static final org.apache.avro.Schema SCHEMA$;
static {
  SCHEMA$ = SchemaBuilder
    .recordType("Test")
    .namespace("org.apache.avro")
    .unionString(stringField, defaultVal,
      SchemaBuilder.INT,
      SchemaBuilder.arrayType(SchemaBuilder.INT).build(),
      SchemaBuilder.mapType(SchemaBuilder.unionType(
        SchemaBuilder.INT,
        SchemaBuilder.LONG)
      )
    )
    .build();
}
{code}

we could write something more like:

{code}
public static final org.apache.avro.Schema SCHEMA$;
static {
  SCHEMA$ = SchemaBuilder
    .recordType("Test")
    .namespace("org.apache.avro")
    .unionString(stringFieldName, defaultVal)
      .andInt()
      .andArrayOf().int()
      .andMapOf().unionInt().andLong()
    .build();
}
{code}
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644503#comment-13644503 ]

Hudson commented on AVRO-1274:
------------------------------

Integrated in AvroJava #367 (See [https://builds.apache.org/job/AvroJava/367/])
AVRO-1274. Java: Add a schema builder API. (Revision 1476973)

Result = SUCCESS
tomwhite :
Files :
* /avro/trunk/CHANGES.txt
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/Schema.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/SchemaBuilder.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/SchemaBuilderException.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericRecordBuilder.java
* /avro/trunk/lang/java/avro/src/test/java/org/apache/avro/TestSchemaBuilder.java
* /avro/trunk/lang/java/avro/src/test/resources/SchemaBuilder.avsc
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642247#comment-13642247 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

+1 Yes, looks good!
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640961#comment-13640961 ]

Tom White commented on AVRO-1274:
---------------------------------

Scott, are you OK for this to be committed now?
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13639516#comment-13639516 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

This looks good.

Minor nit: perhaps change defaultValue(val) to default(val) for brevity and alignment with the name of the property in JSON.

Minor concern: How does this API deal with names that are the full name? For example, the two below should be the same:

{code}
SchemaBuilder.recordType("myrecord").namespace("org.example").build();
SchemaBuilder.recordType("org.example.myrecord").build();
{code}

But we should document the behavior when mixing the two:

{code}
SchemaBuilder.recordType("org.example1.myrecord").namespace("org.example2").build();
{code}

It would be nice if the builder API behaved consistently with the schema parser when provided similar information:

{"type": "record", "name": "org.example1.myrecord", "namespace": "org.example2"}

This matters in part because, if the builder API were in sync with the parser, we could use it in the parser, simplifying the parser and making behavior consistent.
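As a rough illustration of the naming question raised above: the Avro specification says that a name containing a dot is taken as a fullname, and any namespace given alongside it is ignored. The helper below sketches only that rule. `NameRule` and `fullName` are hypothetical names, not part of Avro or of the patch, and name validation is left out.

```java
// Hypothetical helper, not part of Avro: sketches the spec's rule for names.
final class NameRule {
    /** Returns the fullname for a (name, namespace) pair. */
    static String fullName(String name, String namespace) {
        if (name.indexOf('.') >= 0) {
            return name; // already a fullname; the namespace argument is ignored
        }
        if (namespace == null || namespace.isEmpty()) {
            return name; // no namespace to prefix
        }
        return namespace + "." + name;
    }

    public static void main(String[] args) {
        System.out.println(fullName("myrecord", "org.example"));
        System.out.println(fullName("org.example.myrecord", null));
        System.out.println(fullName("org.example1.myrecord", "org.example2"));
    }
}
```

Under this rule the first two calls give the same fullname, and the mixed case resolves to org.example1.myrecord, which is what a builder would need to do to agree with the spec.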
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13639554#comment-13639554 ]

Doug Cutting commented on AVRO-1274:
------------------------------------

'default' is a reserved word in Java and cannot be used as a method name.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609230#comment-13609230 ]

Tom White commented on AVRO-1274:
---------------------------------

{quote}
nullable with null default, nullable with no default, nullable with a value default, non-nullable with a default, and non-null without a default
{quote}

This could be confusing! I think we need to make the common cases accessible and easy to understand. Required, optional, and optional with a default are all common cases. The other two (nullable with no default, and to a lesser extent non-nullable with a default) are not, so we need to work out a way of exposing them (if we expose them at all at the moment) that makes sense in the context of IDE autocomplete, which is how I think this API will be experienced.

One renaming might be the following, but I'm not sure what I think about it.

{noformat}
intType(name)
intType(name, default)
nullableIntType(name)
nullableIntType(name, default)
nullableIntTypeNoDefault(name)
{noformat}

Another way would be to leave the naming we have, and offer an escape hatch for advanced users, {{SchemaBuilder.recordType("r").field("f0")...}} with the advanced methods.

One thing I do want to avoid is excessive chaining, since if you have something like {{name("foo").nullable().int()}} then it's not clear to users what parts of the field definition are optional (e.g. nullable is but the type isn't). This is why I prefer the overloaded variants of requiredX/optionalX.

Regarding enforcing the default in union types, the following change to the API should do it:

{noformat}
Schema schema = SchemaBuilder.recordType("r")
  .unionLong("myunion").withType(SchemaBuilder.NULL).build();
{noformat}

or

{noformat}
Schema schema = SchemaBuilder.recordType("r")
  .unionLong("myunion", 7L).withType(SchemaBuilder.INT).build();
{noformat}

I'll create a patch for that while we decide what to do about the optional/nullable API.
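A minimal sketch of the idea behind the two unionLong snippets above: when the first branch's Java type is part of the method signature, a default of the wrong type is rejected by the compiler rather than at build time. `UnionSketch` is a hypothetical class written for illustration; it emits JSON text for a single field, whereas the builder in the patch returns a Schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the SchemaBuilder from the patch.
final class UnionSketch {
    private final String name;
    private final List<String> types = new ArrayList<>();
    private final String defaultJson; // null means "no default"

    private UnionSketch(String name, String firstType, String defaultJson) {
        this.name = name;
        this.types.add(firstType);
        this.defaultJson = defaultJson;
    }

    /** Union whose first branch is long, with no default. */
    static UnionSketch unionLong(String name) {
        return new UnionSketch(name, "long", null);
    }

    /** Union whose first branch is long; the default must be a long. */
    static UnionSketch unionLong(String name, long defaultValue) {
        return new UnionSketch(name, "long", Long.toString(defaultValue));
    }

    /** Appends a further branch; later branches cannot carry a default. */
    UnionSketch withType(String type) {
        types.add(type);
        return this;
    }

    /** Renders the field as JSON text. */
    String build() {
        StringBuilder sb = new StringBuilder("{\"name\":\"" + name + "\",\"type\":[");
        for (int i = 0; i < types.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append('"').append(types.get(i)).append('"');
        }
        sb.append(']');
        if (defaultJson != null) sb.append(",\"default\":").append(defaultJson);
        return sb.append('}').toString();
    }
}
```

With this shape, `UnionSketch.unionLong("myunion", "seven")` does not compile, while `UnionSketch.unionLong("myunion", 7L).withType("int").build()` yields a field whose default matches its first branch.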
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607768#comment-13607768 ]

Doug Cutting commented on AVRO-1274:
------------------------------------

Schema, Field, Protocol and Message do actually have a common base class:

http://avro.apache.org/docs/current/api/java/org/apache/avro/JsonProperties.html

I'm not sure how much this can be exploited to simplify generic traversal. It would be nice to have a generic traversal API. I've started to write one several times but given up since it was far easier in each case to write another recursive walker with a switch statement.

I believe that Tom's API is sufficiently independent of the underlying Schema API that it can survive changes to that. I'd hate to see the addition of this much-needed builder API held back for a re-design of the Schema API.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607846#comment-13607846 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

I agree, don't hold this up. It appears to be the proper abstraction for the job: it does not leak implementation details and is more a Java definition of the Schema spec. For example:

{code}
public FieldBuilder optionalInt(String name, int defaultValue) {
  return new FieldBuilder(this, name, INT, true, toJsonNode(defaultValue));
}
{code}

does not leak the JsonNode stuff out to the API, and requires that the default value is the proper type.

There may be some more work to do to reach all parts of the spec or aid ease of use (perhaps in another ticket), but if all uses are spec-compatible and type-safe, then it is extremely unlikely we'll need an API change to this at any point in the future unless it involves a corresponding spec change.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607952#comment-13607952 ]

Tom White commented on AVRO-1274:
---------------------------------

Thanks for taking a look Scott. I agree that over time the builder API can be used as a replacement to hide the problems with the existing Schema API from users.

Regarding the required field with default value - I'll add that. Also, we could check the union's first type is consistent with any default, but I can't see a way of getting it to be a compile-time check - we'd have to do it when the schema is built.

I can make these changes in this JIRA or another one - either way works for me.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13608218#comment-13608218 ]

Scott Carey commented on AVRO-1274:
-----------------------------------

Another type of schema that the builder cannot create (easily) is an optional field with no default. Such a schema is brittle when used to read, but in some cases that is desired -- you may want to fail if the data being read does not contain a field or matching value at the source.

optional and required don't feel like the right names -- the latter is not required if it has a default value, and the former may be required if it does not have a default value. nullable is a more exact description for the former.

This means there are 5 methods per type if we keep the builder similar -- nullable with null default, nullable with no default, nullable with a value default, non-nullable with a default, and non-null without a default.

A different way to handle this is to move the default handling to a specialized field builder per type (type-builder?) rather than have the method count be combinatorial (5 + N methods rather than 5 * N methods, for N types and 5 default options). This is the same code that would be required to make union defaults type-safe. (When building a union, the first type would have to be added explicitly and return the appropriate default builder, then other types could be added to the union.)

We could split it into enough types to make it more composable. Below are some ideas that I haven't thought through completely, and I might take a stab at it in 4 weeks:

{code}
nullableInt("foo").default(1);    // for nullable int (a union of null and int, which is ordered properly based on whether it has a non-null default)
nullableInt("foo").defaultNull(); // null default, if missing on read the field is null
nullableInt("foo").required();    // no default value, the field is required
int("foo").default(-1);           // for non-nullable int with default -1
int("foo").required();
{code}

or completely chained syntax for each step (which requires several more builder types but can be perfectly type safe):

{code}
// capture name separately, since we want to build types without names
// elsewhere and those have the same API otherwise
name("foo").nullable().int().default(1);
name("foo").nullable().int().nullDefault();
name("foo").nullable().int().required();
name("foo").int().default(-1);
name("foo").int().required();

// re-use type building for fields for array inner type
name("foo").arrayOf().int().nullable().default(new int[] {0});

// re-use type building again, and also only allow the first one to be a type
// builder that supports defaults; the type builder after and() does not
// support defaults. We can't prevent unions from adding the same type twice at
// this point without making a type for every combinatorial subset of unnamed
// types, due to limitations with Java's type system.
name("foo").unionOf().fixed(4).default(new byte[] {127, 0, 0, 1}).and().fixed(16);

// complex example
new RecordTypeBuilder("org.apache.avro.example.Tree")
  .field("left").nullable().recordReference("org.apache.avro.example.Tree").defaultNull()
  .field("data").string().required()
  .field("right").nullable().recordReference("org.apache.avro.example.Tree").defaultNull()
  .build();
{code}

nullable() is a special case union:

{code}
// a special case binary union of null and a single other type
field("foo").nullable().int().defaultNull()

// same, but allows for adding more than one additional type to the union and
// does not support rearranging the order of the two for default purposes
field("foo").unionOf().null().and().int().defaultNull();
{code}
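The "5 + N rather than 5 * N" point can be sketched roughly as follows: one stage picks the type (N methods), and a single generic stage handles the default options for every type. All names here (`FieldSketch`, `TypeStage`, `DefaultStage`, `intType`, `withDefault`) are hypothetical; `int`, `default` and `null` are reserved words in Java, so the method names from the proposal cannot be used verbatim. For brevity the sketch emits JSON text and checks `defaultNull()` at run time, where a fuller design would use a separate stage type.

```java
// Hypothetical staged builder, not the API from the patch.
final class FieldSketch {
    /** Stage 1: capture the field name. */
    static TypeStage field(String name) {
        return new TypeStage(name, false);
    }

    /** Stage 2: choose nullability and type (one method per type). */
    static final class TypeStage {
        private final String name;
        private final boolean nullable;

        TypeStage(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }

        TypeStage nullable() { return new TypeStage(name, true); }
        DefaultStage<Integer> intType() { return new DefaultStage<>(name, "int", nullable); }
        DefaultStage<String> stringType() { return new DefaultStage<>(name, "string", nullable); }
    }

    /** Stage 3: default handling, written once and shared by all types. */
    static final class DefaultStage<T> {
        private final String name;
        private final String type;
        private final boolean nullable;

        DefaultStage(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }

        /** Non-null default: the value's type goes first in a nullable union. */
        String withDefault(T value) {
            String json = (value instanceof String) ? "\"" + value + "\"" : String.valueOf(value);
            String typeJson = nullable ? "[\"" + type + "\",\"null\"]" : "\"" + type + "\"";
            return render(typeJson, json);
        }

        /** Null default: null goes first in the union. */
        String defaultNull() {
            if (!nullable) throw new IllegalStateException("defaultNull() needs nullable()");
            return render("[\"null\",\"" + type + "\"]", "null");
        }

        /** No default at all. */
        String required() {
            String typeJson = nullable ? "[\"null\",\"" + type + "\"]" : "\"" + type + "\"";
            return render(typeJson, null);
        }

        private String render(String typeJson, String defaultJson) {
            String s = "{\"name\":\"" + name + "\",\"type\":" + typeJson;
            if (defaultJson != null) s += ",\"default\":" + defaultJson;
            return s + "}";
        }
    }
}
```

Because `DefaultStage<T>` is generic, `field("foo").intType().withDefault("x")` does not compile, and adding a new type costs one method on `TypeStage` rather than one per default option.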
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603315#comment-13603315 ]

Tom White commented on AVRO-1274:
---------------------------------

I'm wondering if the correct way to do this is actually to have [null, T] for optional fields with no default:

{"name": "optionalBoolean", "type": ["null", "boolean"], "default": null}

and [T, null] when there is a non-null default:

{"name": "optionalBooleanWithDefault", "type": ["boolean", "null"], "default": true}
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603636#comment-13603636 ]

Doug Cutting commented on AVRO-1274:
------------------------------------

{quote}
I'm wondering if the correct way to do this is actually to have [null, T] for optional fields with no default [ ... ] and [T, null] when there is a non-null default.
{quote}

The latter is certainly required when there is a non-null default.

The former is subtly different. A reader with a [null, T] union with no default value specified still requires that the field be present in the writer's schema. So it's a required nullable field as opposed to an entirely optional field. This subtlety is confusing, so glossing over it in the builder API by always generating a default value of null for nullable fields with no other default value specified is probably best.
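The distinction between a "required nullable" field and an optional one comes from schema resolution: when the writer's schema lacks a field, the reader falls back to its default, and if there is none the read fails. The sketch below models only that rule, with a map standing in for the written record. `ResolutionSketch` and `ReaderField` are hypothetical names, not Avro classes.

```java
import java.util.Map;

// Hypothetical model of the resolution rule, not Avro's resolver.
final class ResolutionSketch {
    /** A reader-side field: "default is null" and "no default" are distinct. */
    static final class ReaderField {
        final String name;
        final boolean hasDefault;
        final Object defaultValue;

        private ReaderField(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }

        static ReaderField withDefault(String name, Object defaultValue) {
            return new ReaderField(name, true, defaultValue);
        }

        static ReaderField noDefault(String name) {
            return new ReaderField(name, false, null);
        }
    }

    /** Returns the value the reader sees for one field. */
    static Object read(Map<String, Object> writtenRecord, ReaderField field) {
        if (writtenRecord.containsKey(field.name)) {
            return writtenRecord.get(field.name); // writer supplied the field
        }
        if (field.hasDefault) {
            return field.defaultValue; // may legitimately be null
        }
        throw new IllegalStateException(
            "writer has no field '" + field.name + "' and reader has no default");
    }
}
```

A [null, T] field built with `ReaderField.withDefault(name, null)` reads as null when the writer omitted it; the same field built with `noDefault(name)` fails, which is the case the comment above suggests the builder should avoid by default.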
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603884#comment-13603884 ]

Doug Cutting commented on AVRO-1274:
------------------------------------

Some nits:
- If a default value has a nested bytes then it will fail. For example, a field whose type is a record with a field named 'a' of type bytes can have a default value of {"a":"asdf"}, but GenericData.toString() won't generate this correctly. I think this can just remain a known issue until we fix GenericData.toString(), but we should probably add a comment noting that.
- Is SchemaParseException the right exception here? Perhaps AvroRuntimeException, or some new exception like SchemaBuilderError, would fit better.

Other than that, this looks great! +1
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602459#comment-13602459 ]

Josh Wills commented on AVRO-1274:

Hey Tom-- I am of no help on the bytes default values problem, I just wanted to say that I love the new API. :)
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602567#comment-13602567 ]

Doug Cutting commented on AVRO-1274:

This should look like:

  {"name":"optionalBytesWithDefault", "type":["null", "bytes"], "default":null}

If a field's type is a union, then the type of the default is the type of the first element in the union. So the only valid default value for a union of the form ["null", ...] is null. Some other valid examples of unions with defaults are:

  {"name":"f1", "type":["string", "int"], "default":""}
  {"name":"f2", "type":["int", "string"], "default":0}

Default values differ from what JsonEncoder would produce. JsonEncoder qualifies the values of a union with their type, rendering {"bytes":"foo"} rather than just "foo" for a value whose schema is ["bytes", ...]. Default values are not so qualified.

Does that help?
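The "default matches the first branch" rule from the comment above can be illustrated with a few lines of plain Java. This is a sketch only: `UnionDefaultCheck` and its methods are hypothetical stand-ins rather than Avro's own validation code, and it covers just the four primitive types used in the examples.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for Avro's default checking; not part of the Avro API. */
final class UnionDefaultCheck {

    /** Returns true if 'value' is a plausible JSON default for the given primitive type name. */
    static boolean matches(String type, Object value) {
        switch (type) {
            case "null":   return value == null;
            case "int":    return value instanceof Integer;
            case "string": return value instanceof String;
            case "bytes":  return value instanceof String; // bytes defaults are written as JSON strings
            default:       return false;                   // other types are out of scope for this sketch
        }
    }

    /** A union default is accepted only if it matches the FIRST branch of the union. */
    static boolean isValidUnionDefault(List<String> branches, Object defaultValue) {
        return !branches.isEmpty() && matches(branches.get(0), defaultValue);
    }

    public static void main(String[] args) {
        System.out.println(isValidUnionDefault(Arrays.asList("null", "bytes"), null));  // true
        System.out.println(isValidUnionDefault(Arrays.asList("null", "bytes"), "foo")); // false
        System.out.println(isValidUnionDefault(Arrays.asList("string", "int"), ""));    // true
        System.out.println(isValidUnionDefault(Arrays.asList("int", "string"), 0));     // true
    }
}
```

Note that the default is an unqualified value here, in line with the comment: nothing like {"bytes":"foo"} is involved.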
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602613#comment-13602613 ]

Doug Cutting commented on AVRO-1274:

Sorry, I wrote the above before looking at your patch and the sources. That {"bytes":"foo"} thing is indeed coming from GenericData#toString. (It dates back to the pre-history of Avro. I must have had some good intention when I added it, but it sure looks evil now.)

We should probably remove it, but that would be an incompatible change. Perhaps the next release should be 1.8.0 instead of 1.7.5. There are a few other minor incompatible changes queued that would be nice to get out.

Or we can work around this, specially handling binary default values.
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602789#comment-13602789 ]

Doug Cutting commented on AVRO-1274:

Defaults are primarily used at read time to supply values for fields missing from the writer's schema. The record builder API will also fill in default values at object creation time (i.e., prior to write, typically). To build generic instances with defaults, use GenericRecordBuilder. For example, with the schema:

{code}
{"type":"record", "name":"r", "fields":[{"name":"f", "type":"int", "default":0}]}
{code}

you should see:

{code}
new GenericRecordBuilder(schema).build().toString() -> {"f": 0}
new GenericRecordBuilder(schema).set("f", 1).build().toString() -> {"f": 1}
{code}
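To make the "defaults are filled in at object creation time" behaviour above concrete, here is a dependency-free sketch of the idea. It is not GenericRecordBuilder: `DefaultFillingBuilderSketch` is a hypothetical class that models a record as a map and assumes every field has a default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; Avro's real class for this is GenericRecordBuilder. */
final class DefaultFillingBuilderSketch {
    private final Map<String, Object> defaults;                       // field name -> schema default
    private final Map<String, Object> values = new LinkedHashMap<>(); // explicitly set fields

    DefaultFillingBuilderSketch(Map<String, Object> defaults) {
        this.defaults = new LinkedHashMap<>(defaults);
    }

    /** Records an explicit value; unknown field names are rejected. */
    DefaultFillingBuilderSketch set(String field, Object value) {
        if (!defaults.containsKey(field)) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        values.put(field, value);
        return this;
    }

    /** Builds the record, using the schema default for every field that was not set. */
    Map<String, Object> build() {
        Map<String, Object> record = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : defaults.entrySet()) {
            String name = e.getKey();
            record.put(name, values.containsKey(name) ? values.get(name) : e.getValue());
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> defaults = new LinkedHashMap<>();
        defaults.put("f", 0);
        System.out.println(new DefaultFillingBuilderSketch(defaults).build());             // {f=0}
        System.out.println(new DefaultFillingBuilderSketch(defaults).set("f", 1).build()); // {f=1}
    }
}
```

The printed form is Java's map syntax, not the JSON-like rendering that GenericData.toString() produces.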
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601319#comment-13601319 ]

Doug Cutting commented on AVRO-1274:

This looks great, Tom, and has long been needed.

- The downside of calling this Schema.Builder is that it makes the Schema class even bigger. The upside is that if you 'import Schema.Builder' then the code is sleeker. But perhaps the preferred import should instead be 'import static SchemaBuilder.*'? The static methods have unique-enough names that this might work well. What do you think?
- We can convert from Java object to JsonNode by parsing the output of GenericData.toString(Object).
[jira] [Commented] (AVRO-1274) Add a schema builder API
[ https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600838#comment-13600838 ]

Josh Wills commented on AVRO-1274:

Hey Tom--

I wrote something along these lines way back in the day:

https://github.com/jwills/avroplay/blob/master/src/com/randomgraphs/avro/RecordSchemaBuilder.java

The general orientation is towards supporting the union { null, T } pattern for optional fields with default values, and it ends up looking like:

  Schema schema = new RecordSchemaBuilder("myrecord")
      .requiredString("foo")
      .optionalFloat("bar", 17.29f)
      .array("baz", Schema.create(Schema.Type.STRING))
      .build();

It has support for default values for primitive types and just wraps them in JsonNodes as needed, and is smart about checking to see if your record is named or anonymous. I'm happy to re-format it as a patch if you think it's worthwhile. My main feeling was that the name, type, and required/optional nature of the field are the three things you really always have to know, and whether or not you have a doc string or sort order info should be hidden away as rarely-used options in this context.