[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-05-02 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647360#comment-13647360
 ] 

Scott Carey commented on AVRO-1274:
---

I am working on a modification to the builder that would make its use look like 
a JSON schema.

{code}
  public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":["
    + "{\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},"
    + "{\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},"
    + "{\"name\":\"serverHash\",\"type\":\"MD5\"},"
    + "{\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}"
    + "]}");
{code}

becomes similar to:

{code}
  public static final org.apache.avro.Schema SCHEMA$ = SchemaBuilder
    .typeRecord("HandshakeRequest").namespaceInherited("org.apache.avro.ipc").fields() // optional namespace inheritance
      .typeFixed("clientHash", MD5.SCHEMA$).field()   // or typeFixed("clientHash", "MD5", 16)
      .typeUnion("clientProtocol").ofNull().andString().withProp("avro.java.string", "String").field()
      .typeFixed("serverHash", "MD5").field() // uses reference to already defined MD5
      .typeUnion("meta").ofNull().andMap().withProp("avro.java.string", "String").valuesBytes().field()
    .record();
{code}

We can also have shortcuts as before, for example:

optionalInt("x", -1) as a shortcut for typeUnion("x").ofInt(-1).andNull()

nullableInt("maybe") as a shortcut for typeUnion("maybe").ofNull(null).andInt()

requiredInt("yes") may not be necessary; it would only be a shortcut for 
typeInt("yes").field();

It should be straightforward to implement the whole Schema.Parser with the 
above (and simplify the parser), which makes it easy to test very thoroughly; 
there is an intentional 1:1 mapping between the parser, spec, and the builder.

 Add a schema builder API
 

 Key: AVRO-1274
 URL: https://issues.apache.org/jira/browse/AVRO-1274
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Tom White
Assignee: Tom White
 Fix For: 1.7.5

 Attachments: AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, 
 AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, AVRO-1274.patch, 
 TestDefaults.patch


 It would be nice to have a fluent API that made it easier to construct record 
 schemas.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-05-02 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647629#comment-13647629
 ] 

Tom White commented on AVRO-1274:
-

I'm slightly reluctant to add lots of overloaded methods (as I mentioned 
above), since it makes the builder much harder to use in an IDE with 
autocompletion. Will the user be able to see the difference between optionalInt 
and nullableInt? Or requiredInt and typeInt?

A way to specify properties is missing so we should add that. Let's discuss 
this and other changes in new JIRAs.



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-05-02 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648048#comment-13648048
 ] 

Scott Carey commented on AVRO-1274:
---

I am planning on constraining the lexical scope via many cascaded builders / 
assemblers so that the list to auto-complete at any time is small.

I'll make a new JIRA for my proposed changes.
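
To illustrate what constraining the scope with cascaded builders could look like, here is a minimal sketch in plain Java. All names in it (StagedUnionSketch, UnionStart, UnionRest, typeUnion, ofNull, ofInt, andInt, andString, field) are hypothetical and only loosely follow the names floated in this thread; none of them are part of Avro. Each stage exposes only the methods that are legal at that point, so the list offered by autocomplete stays short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical staged builder for a union field; not part of Avro. */
final class StagedUnionSketch {

    /** Stage 1: only the first branch of the union can be chosen here. */
    static final class UnionStart {
        private final String fieldName;

        UnionStart(String fieldName) {
            this.fieldName = fieldName;
        }

        UnionRest ofNull() {
            return new UnionRest(fieldName, "null");
        }

        UnionRest ofInt() {
            return new UnionRest(fieldName, "int");
        }
    }

    /** Stage 2: further branches can be appended, or the field finished. */
    static final class UnionRest {
        private final String fieldName;
        private final List<String> branches = new ArrayList<>();

        UnionRest(String fieldName, String firstBranch) {
            this.fieldName = fieldName;
            branches.add(firstBranch);
        }

        UnionRest andInt() {
            branches.add("int");
            return this;
        }

        UnionRest andString() {
            branches.add("string");
            return this;
        }

        /** Renders the field as Avro-style JSON text (no escaping, sketch only). */
        String field() {
            List<String> quoted = new ArrayList<>();
            for (String branch : branches) {
                quoted.add("\"" + branch + "\"");
            }
            return "{\"name\":\"" + fieldName + "\",\"type\":["
                + String.join(",", quoted) + "]}";
        }
    }

    /** Entry point: the only thing visible before a union has been started. */
    static UnionStart typeUnion(String fieldName) {
        return new UnionStart(fieldName);
    }
}
```

With this shape, typeUnion("maybe") offers only the "of" methods, and the "and" methods appear only once a first branch exists.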



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-05-01 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647196#comment-13647196
 ] 

Scott Carey commented on AVRO-1274:
---

We may have more work to do here. 

How would you use the builder to do the equivalent of:

{code}
  public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":["
    + "{\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},"
    + "{\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},"
    + "{\"name\":\"serverHash\",\"type\":\"MD5\"},"
    + "{\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}"
    + "]}");
{code}

?

I am trying to suggest that we replace literal strings with the builder in 
AVRO-1316 but cannot seem to replicate the above with the builder.

The clientProtocol and meta fields are the problem.  It does not seem 
possible to create a union of null and 'more' without a default.

Additionally, unionType is confusing.  Is this how it would be done?  If so, 
I do not see how to add types to the union if I start with:

{code}
unionType("clientProtocol", SchemaBuilder.NULL)
{code}
Then how do I add extra types?  Or is the type passed in expected to _be_ a 
union?  If so, the field should be named unionSchema and the javadoc needs to be 
clear.

This builder API makes it hard to create union fields without defaults.  
Perhaps it is simply a documentation issue and the doc for unionType() needs an 
example.  

Should we open a new ticket for these concerns or re-open this one?  I suspect 
it is largely documentation but am not sure.



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-05-01 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13647221#comment-13647221
 ] 

Scott Carey commented on AVRO-1274:
---

I think the answer to my question would be:

{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("HandshakeRequest")
      .namespace("org.apache.avro.ipc")
      .requiredFixed("clientHash", MD5.SCHEMA$)
      .unionType("clientProtocol", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.STRING)
        .build())
      .addFieldProp("avro.java.string", "String")
      .requiredFixed("serverHash", MD5.SCHEMA$)
      .unionType("meta", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.mapType(SchemaBuilder.BYTES)
            .addFieldProp("avro.java.string", "String")
            .build())
        .build())
      .build();
  }
{code}

but I am not sure.  Also addFieldProp() does not exist.

What is odd is that there are two unionType() methods, one takes varargs and 
the other does not.  I suspect that the intention was for both to use varargs 
so that the nested union building is not required by the user.

It would be much simpler if unions without defaults had a shortcut:

{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("HandshakeRequest")
      .namespace("org.apache.avro.ipc")
      .requiredFixed("clientHash", MD5.SCHEMA$)
      .nullableString("clientProtocol")
        .addFieldProp("avro.java.string", "String")
      .requiredFixed("serverHash", MD5.SCHEMA$)
      .nullableMap("meta", SchemaBuilder.BYTES)
        .addFieldProp("avro.java.string", "String")
      .build();
  }
{code}

Building unions in general feels clunky as well since you have to break 
chaining and use SchemaBuilder again.  Instead of taking a varargs list of 
schemas in the union, the type returned could be a UnionBuilder.  So instead of:
{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("Test")
      .namespace("org.apache.avro")
      .unionString(stringField, defaultVal,
         SchemaBuilder.INT,
         SchemaBuilder.arrayType(SchemaBuilder.INT).build(),
         SchemaBuilder.mapType(SchemaBuilder.unionType(
           SchemaBuilder.INT, SchemaBuilder.LONG)
           )
        )
      .build();
  }
{code}

we could write something more like:
{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("Test")
      .namespace("org.apache.avro")
      .unionString(stringFieldName, defaultVal)
         .andInt()
         .andArrayOf().int()
         .andMapOf().unionInt().andLong()
      .build();
  }
{code}



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644503#comment-13644503
 ] 

Hudson commented on AVRO-1274:
--

Integrated in AvroJava #367 (See [https://builds.apache.org/job/AvroJava/367/])
AVRO-1274. Java: Add a schema builder API. (Revision 1476973)

 Result = SUCCESS
tomwhite : 
Files : 
* /avro/trunk/CHANGES.txt
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/Schema.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/SchemaBuilder.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/SchemaBuilderException.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
* /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericRecordBuilder.java
* /avro/trunk/lang/java/avro/src/test/java/org/apache/avro/TestSchemaBuilder.java
* /avro/trunk/lang/java/avro/src/test/resources/SchemaBuilder.avsc




[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-04-25 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642247#comment-13642247
 ] 

Scott Carey commented on AVRO-1274:
---

+1 Yes, looks good!



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-04-24 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13640961#comment-13640961
 ] 

Tom White commented on AVRO-1274:
-

Scott, are you OK for this to be committed now?



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-04-23 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13639516#comment-13639516
 ] 

Scott Carey commented on AVRO-1274:
---

This looks good.  

Minor nit:  perhaps change defaultValue(val) to default(val) for brevity and 
alignment with the name of the property in JSON.

Minor concern:  How does this API deal with names that are the full name?  

For example, the two below should be the same:
{code}
SchemaBuilder.recordType("myrecord").namespace("org.example").build();
SchemaBuilder.recordType("org.example.myrecord").build();
{code}

But we should document the behavior when mixing the two:
{code}
SchemaBuilder.recordType("org.example1.myrecord").namespace("org.example2").build();
{code}

It would be nice if the builder API behaved consistently with the schema parser 
when provided similar information:
{"type": "record", "name": "org.example1.myrecord", "namespace": "org.example2"}

In part because if the builder API was in sync with the parser, we could use it 
in the parser, simplifying the parser and making behavior consistent.
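
For reference, the Avro specification states that a name containing a dot is treated as a full name, and that any namespace specified alongside it is ignored. The sketch below expresses that rule in plain Java. FullNameSketch and fullName are hypothetical names introduced for illustration; this is not Avro's own code, and the thread does not say whether the builder in the patch behaves this way.

```java
/** Hypothetical helper mirroring the Avro spec's name rule; not Avro's own code. */
final class FullNameSketch {

    /**
     * Resolves a record name and an optional namespace to a full name.
     * A dotted name wins over a separately supplied namespace.
     */
    static String fullName(String name, String namespace) {
        // A dotted name is already a full name; the namespace is ignored.
        if (name.indexOf('.') >= 0) {
            return name;
        }
        // No namespace: the simple name stands alone.
        if (namespace == null || namespace.isEmpty()) {
            return name;
        }
        return namespace + "." + name;
    }
}
```

Under this rule the first two builder calls above yield org.example.myrecord, and the mixed case yields org.example1.myrecord with org.example2 discarded.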




[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-04-23 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13639554#comment-13639554
 ] 

Doug Cutting commented on AVRO-1274:


'default' is a reserved word in Java and cannot be used as a method name.



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-21 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13609230#comment-13609230
 ] 

Tom White commented on AVRO-1274:
-

> nullable with null default, nullable with no default, nullable with a value 
> default, non-nullable with a default, and non-null without a default

This could be confusing! I think we need to make the common cases accessible 
and easy to understand. Required, optional, and optional with a default are all 
common cases. The other two (nullable with no default, and to a lesser extent 
non-nullable with a default) are not, so we need to work out a way of exposing 
them (if we expose them at all at the moment) that makes sense in the context 
of IDE autocomplete, which is how I think this API will be experienced.

One renaming might be the following, but I'm not sure what I think about it.

{noformat}
intType(name)
intType(name, default)
nullableIntType(name)
nullableIntType(name, default)
nullableIntTypeNoDefault(name)
{noformat}

Another way would be to leave the naming we have, and offer an escape hatch for 
advanced users, {{SchemaBuilder.recordType("r").field("f0")...}} with the 
advanced methods.

One thing I do want to avoid is excessive chaining, since if you have something 
like {{name("foo").nullable().int()}} then it's not clear to users what parts 
of the field definition are optional (e.g. nullable is but the type isn't). 
This is why I prefer the overloaded variants of requiredX/optionalX.

Regarding enforcing the default in union types, the following change to the API 
should do it: 

{noformat}
Schema schema = SchemaBuilder.recordType("r")
 .unionLong("myunion").withType(SchemaBuilder.NULL).build();
{noformat}

or

{noformat}
Schema schema = SchemaBuilder.recordType("r")
 .unionLong("myunion", 7L).withType(SchemaBuilder.INT).build();
{noformat}

I'll create a patch for that while we decide what to do about the 
optional/nullable API.



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-20 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607768#comment-13607768
 ] 

Doug Cutting commented on AVRO-1274:


Schema, Field, Protocol and Message do actually have a common base class:

http://avro.apache.org/docs/current/api/java/org/apache/avro/JsonProperties.html

I'm not sure how much this can be exploited to simplify generic traversal.  It 
would be nice to have a generic traversal API.  I've started to write one 
several times but given up since it was far easier in each case to write 
another recursive walker with a switch statement.

I believe that Tom's API is sufficiently independent of the underlying Schema 
API that it can survive changes to that.  I'd hate to see the addition of this 
much-needed builder API held back for a re-design of the Schema API.
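
The "recursive walker with a switch statement" pattern mentioned above can be sketched as follows. This uses a toy model (SchemaWalkerSketch, Kind, Node) in place of org.apache.avro.Schema so that it stays self-contained; every name here is hypothetical, and the toy model does not handle recursive (self-referencing) records, which a real walker would need to guard against.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy schema model and walker; a stand-in for the real Schema class. */
final class SchemaWalkerSketch {

    enum Kind { RECORD, ARRAY, MAP, UNION, INT, STRING, NULL }

    static final class Node {
        final Kind kind;
        final String name;         // only meaningful for RECORD
        final List<Node> children; // fields, element type, value type, or branches

        Node(Kind kind, String name, Node... children) {
            this.kind = kind;
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** Visits the tree depth-first and records every record name it meets. */
    static void collectRecordNames(Node node, List<String> out) {
        switch (node.kind) {
            case RECORD:
                out.add(node.name);
                visitChildren(node, out);
                break;
            case ARRAY:
            case MAP:
            case UNION:
                visitChildren(node, out);
                break;
            default:
                break; // primitives have nothing to descend into
        }
    }

    private static void visitChildren(Node node, List<String> out) {
        for (Node child : node.children) {
            collectRecordNames(child, out);
        }
    }

    static List<String> recordNames(Node root) {
        List<String> out = new ArrayList<>();
        collectRecordNames(root, out);
        return out;
    }
}
```

Each new traversal (collecting names, counting fields, rewriting properties) ends up repeating this switch, which is the duplication a generic traversal API would remove.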



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-20 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607846#comment-13607846
 ] 

Scott Carey commented on AVRO-1274:
---

I agree, don't hold this up.

It appears to be the proper abstraction for the job:  it does not leak 
implementation details and is more a Java definition of the Schema spec.  For 
example:

{code}
public FieldBuilder optionalInt(String name, int defaultValue) {
  return new FieldBuilder(this, name, INT, true, toJsonNode(defaultValue));
}
{code}
does not leak the JsonNode stuff out to the API, and requires that the default 
value is the proper type.   There may be some more work to do to reach all 
parts of the spec or aid ease of use (perhaps in another ticket), but if all 
uses are spec-compatible and type-safe, then it is extremely unlikely we'll 
need an API change to this at any point in the future unless it involves a 
corresponding spec change.





[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-20 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13607952#comment-13607952
 ] 

Tom White commented on AVRO-1274:
-

Thanks for taking a look Scott. I agree that over time the builder API can be 
used as a replacement to hide the problems with the existing Schema API from 
users.

Regarding the required field with default value - I'll add that. Also, we could 
check the union's first type is consistent with any default, but I can't see a 
way of getting it to be a compile-time check - we'd have to do it when the 
schema is built. I can make these changes in this JIRA or another one - either 
way works for me. 
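
A build-time check of the kind described above might look like the following sketch. It relies on the Avro rule that a union field's default must match the first branch of the union. The names UnionDefaultCheckSketch, avroTypeOf and checkDefault are hypothetical, only a handful of primitive types are covered, and this is not how the committed SchemaBuilder is known to validate defaults.

```java
import java.util.List;

/** Hypothetical build-time validation; illustrative names, not Avro's. */
final class UnionDefaultCheckSketch {

    /** Maps a Java default value to the Avro type name it would satisfy. */
    static String avroTypeOf(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Integer) {
            return "int";
        }
        if (value instanceof Long) {
            return "long";
        }
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof String) {
            return "string";
        }
        throw new IllegalArgumentException("unsupported default: " + value);
    }

    /** Throws if the default does not match the first branch of the union. */
    static void checkDefault(List<String> branches, Object defaultValue) {
        String first = branches.get(0);
        String actual = avroTypeOf(defaultValue);
        if (!first.equals(actual)) {
            throw new IllegalStateException(
                "default of type " + actual
                + " does not match first union branch " + first);
        }
    }
}
```

The mismatch surfaces only when the schema is built, not when the calling code is compiled, which is the trade-off noted above.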



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-20 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13608218#comment-13608218
 ] 

Scott Carey commented on AVRO-1274:
---

Another type of schema that the builder cannot create (easily) is an optional 
field with no default.  Such a schema is brittle when used to read, but in some 
cases that is desired -- you may want to fail if the data being read does not 
contain a field or matching value at the source.

'optional' and 'required' don't feel like the right names: a 'required' field 
is not actually required if it has a default value, and an 'optional' field may 
be required if it does not have one.  'nullable' is a more exact description 
for the former.

This means there are 5 methods per type if we keep the builder similar  -- 
nullable with null default, nullable with no default, nullable with a value 
default, non-nullable with a default, and non-null without a default.

A different way to handle this is to move the default handling to a specialized 
field builder per type (type-builder?) rather than have the method count be 
combinatorial (5 + N methods rather than 5 * N methods, for N types and 5 
default options).  This is the same code that would be required to make union 
defaults type-safe (When building a union, the first type would have to be 
added explicitly and return the appropriate default builder, then other types 
could be added to the union).

We could split it into enough types to make it more composable.  Below are some 
ideas that I haven't thought through completely, and I might take a stab at it 
in 4 weeks:
{code}
  nullableInt("foo").default(1);     // nullable int (a union of null and int, ordered
                                     // properly based on whether it has a non-null default)
  nullableInt("foo").defaultNull();  // null default; if missing on read the field is null
  nullableInt("foo").required();     // no default value; the field is required

  int("foo").default(-1);  // non-nullable int with default -1
  int("foo").required();
{code}

or completely chained syntax for each step (which requires several more builder 
types but can be perfectly type safe):
{code}
  // capture name separately, since we want to build types without names
  // elsewhere and those have the same API otherwise
  name("foo").nullable().int().default(1);
  name("foo").nullable().int().nullDefault();
  name("foo").nullable().int().required();
  name("foo").int().default(-1);
  name("foo").int().required();

  // re-use type building for fields for array inner type
  name("foo").arrayOf().int().nullable().default(new int[] {0});

  // re-use type building again, and also only allow the first one to be a type
  // builder that supports defaults; the type builder after and() does not
  // support defaults.  We can't prevent unions from adding the same type twice
  // at this point without making a type for every combinatorial subset of
  // unnamed types, due to limitations with Java's type system.
  name("foo").unionOf().fixed(4).default(new byte[] {127, 0, 0, 1}).and().fixed(16);

  // complex example
  new RecordTypeBuilder("org.apache.avro.example.Tree")
    .field("left").nullable().recordReference("org.apache.avro.example.Tree").defaultNull()
    .field("data").string().required()
    .field("right").nullable().recordReference("org.apache.avro.example.Tree").defaultNull()
    .build();
{code}

nullable() is a special case union
{code}
  // a special case binary union of null and a single other type
  field("foo").nullable().int().defaultNull()

  // same, but allows for adding more than one additional type to the union and
  // does not support rearranging the order of the two for default purposes
  field("foo").unionOf().null().and().int().defaultNull();
{code}
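
The "5 + N rather than 5 * N" idea can be sketched with one generic stage that owns the default handling, plus one small entry method per type. This is only an illustration in plain Java: DefaultStageSketch, NullableStage, nullableInt and nullableString are hypothetical names, withDefault stands in for default() because default is a reserved word in Java, and the JSON rendering does no escaping.

```java
/** Hypothetical per-type default stage; not the committed SchemaBuilder. */
final class DefaultStageSketch {

    /** One reusable stage carrying the default-handling methods for any type T. */
    static final class NullableStage<T> {
        private final String name;
        private final String avroType;

        NullableStage(String name, String avroType) {
            this.name = name;
            this.avroType = avroType;
        }

        /** Non-null default: the value type must come first in the union. */
        String withDefault(T value) {
            String literal = (value instanceof String)
                ? "\"" + value + "\""
                : String.valueOf(value);
            return field("[\"" + avroType + "\",\"null\"]", literal);
        }

        /** Null default: null comes first in the union. */
        String defaultNull() {
            return field("[\"null\",\"" + avroType + "\"]", "null");
        }

        /** No default at all: the writer must supply the field. */
        String required() {
            return field("[\"null\",\"" + avroType + "\"]", null);
        }

        private String field(String unionJson, String defaultJson) {
            String json = "{\"name\":\"" + name + "\",\"type\":" + unionJson;
            if (defaultJson != null) {
                json += ",\"default\":" + defaultJson;
            }
            return json + "}";
        }
    }

    // Adding a type costs one entry method; the default options are written once.
    static NullableStage<Integer> nullableInt(String name) {
        return new NullableStage<>(name, "int");
    }

    static NullableStage<String> nullableString(String name) {
        return new NullableStage<>(name, "string");
    }
}
```

Because the stage is generic in T, nullableInt("foo").withDefault("x") is rejected by the compiler, which is the type safety the chained design is after.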





[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-15 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603315#comment-13603315
 ] 

Tom White commented on AVRO-1274:
-

I'm wondering if the correct way to do this is actually to have [null, T] for 
optional fields with no default:

{"name": "optionalBoolean", "type": ["null", "boolean"], "default": null}

and [T, null] when there is a non-null default:

{"name": "optionalBooleanWithDefault", "type": ["boolean", "null"], "default": true}



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-15 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603636#comment-13603636
 ] 

Doug Cutting commented on AVRO-1274:


> I'm wondering if the correct way to do this is actually to have [null, T] for 
> optional fields with no default [ ... ] and [T, null] when there is a 
> non-null default.

The latter is certainly required when there is a non-null default.

The former is subtly different.  A reader with a [null, T] union with no 
default value specified still requires that the field be present in the 
writer's schema.  So it's a required nullable field as opposed to an entirely 
optional field.  This subtlety is confusing, so glossing over it in the builder 
API by always generating a default value of null for nullable fields with no 
other default value specified is probably best.
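
The difference between "default is null" and "no default" shows up only on the reading side, when the writer's schema lacks the field. The sketch below models that rule from Avro's schema resolution in plain Java. MissingFieldSketch, ReaderField and resolve are hypothetical names, and a map stands in for the data the writer actually wrote; this is not Avro's resolver.

```java
import java.util.Map;

/** Hypothetical model of the reader-side rule for a missing field. */
final class MissingFieldSketch {

    /** A reader field; hasDefault separates "default is null" from "no default". */
    static final class ReaderField {
        final String name;
        final boolean hasDefault;
        final Object defaultValue;

        ReaderField(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }
    }

    /** Returns the value the reader sees, given the fields the writer wrote. */
    static Object resolve(ReaderField field, Map<String, Object> written) {
        if (written.containsKey(field.name)) {
            return written.get(field.name); // writer supplied it, possibly as null
        }
        if (field.hasDefault) {
            return field.defaultValue;      // fall back to the reader's default
        }
        throw new IllegalStateException(
            "writer lacks field '" + field.name + "' and reader has no default");
    }
}
```

A [null, T] field with a null default resolves to null when the writer omits it, while the same union with no default fails, which is why the latter is a required nullable field rather than an optional one.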




[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-15 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603884#comment-13603884
 ] 

Doug Cutting commented on AVRO-1274:


Some nits:
 - If a default value has a nested bytes then it will fail.  For example, a 
field whose type is a record with a field named 'a' of type bytes can have a 
default value of {"a":"asdf"}, but GenericData.toString() won't generate this 
correctly.  I think this can just remain a known issue until we fix 
GenericData.toString(), but we should probably add a comment noting that.
 - Is SchemaParseException the right exception here?  AvroRuntimeException, or 
perhaps a new exception such as SchemaBuilderError, might fit better.

Other than that, this looks great!  +1




[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-14 Thread Josh Wills (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602459#comment-13602459
 ] 

Josh Wills commented on AVRO-1274:
--

Hey Tom-- I am of no help on the bytes default values problem, I just wanted to 
say that I love the new API. :)



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602567#comment-13602567
 ] 

Doug Cutting commented on AVRO-1274:


This should look like:

{"name":"optionalBytesWithDefault", "type":["null", "bytes"], "default":null}

If a field's type is a union, then the type of the default is the type of the 
first element in the union.  So the only valid default value for a union of the 
form [null, ...] is null.  Some other valid examples of unions with defaults 
are:

{"name":"f1", "type":["string", "int"], "default":""}
{"name":"f2", "type":["int", "string"], "default":0}
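
The first-branch rule above can be sketched as a small check. This is a toy model in plain Java, not Avro's own validation: the class UnionDefaultCheck and its methods are hypothetical, and only null, int, and string defaults are modelled.

```java
import java.util.List;

/** Toy model of the union-default rule; hypothetical, not part of the Avro API. */
final class UnionDefaultCheck {

    /** Maps a Java default value to the Avro type name it stands for in this model. */
    static String typeNameOf(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Integer) {
            return "int";
        }
        if (value instanceof String) {
            return "string";
        }
        throw new IllegalArgumentException("unsupported default in this sketch: " + value);
    }

    /** A union default is valid only if its type matches the first branch of the union. */
    static boolean isValidUnionDefault(List<String> branches, Object defaultValue) {
        return !branches.isEmpty() && branches.get(0).equals(typeNameOf(defaultValue));
    }

    public static void main(String[] args) {
        System.out.println(isValidUnionDefault(List.of("null", "bytes"), null)); // true
        System.out.println(isValidUnionDefault(List.of("string", "int"), ""));   // true
        System.out.println(isValidUnionDefault(List.of("int", "string"), 0));    // true
        System.out.println(isValidUnionDefault(List.of("null", "int"), 0));      // false
    }
}
```

The last call shows why a field with a non-null default has to list the non-null type first.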

Default values are different from what JsonEncoder would produce.  JsonEncoder 
qualifies values of a union with their type, rendering {"bytes":"foo"} rather 
than just "foo" for a value whose schema is ["bytes", ...].  But default values 
are not so qualified.

Does that help?



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602613#comment-13602613
 ] 

Doug Cutting commented on AVRO-1274:


Sorry, I wrote the above before looking at your patch and the sources.  That 
{"bytes":"foo"} thing is indeed coming from GenericData#toString.  (It dates 
back to the pre-history of Avro.  I must have had some good intention when I 
added it, but it sure looks evil now.)  We should probably remove it, but that 
would be an incompatible change.  Perhaps the next release should be 1.8.0 
instead of 1.7.5.  There are a few other minor incompatible changes queued that 
would be nice to get out.  Or we can work around this, specially handling 
binary default values.



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602789#comment-13602789
 ] 

Doug Cutting commented on AVRO-1274:


Defaults are primarily used at read time to supply values for fields missing 
from the writer's schema.

The builder API will also fill in default values at object creation time (i.e., 
prior to write, typically).  To build generic instances with defaults use 
GenericRecordBuilder.  For example:

with the schema:

{code}
{"type":"record", "name":"r", "fields":[{"name":"f", "type":"int", 
"default":0}]}
{code}

then you should see:

{code}
new GenericRecordBuilder(schema).build().toString()              // -> {"f": 0}

new GenericRecordBuilder(schema).set("f", 1).build().toString()  // -> {"f": 1}
{code}





[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-13 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601319#comment-13601319
 ] 

Doug Cutting commented on AVRO-1274:


This looks great, Tom, and has long been needed.

 - The downside of calling this Schema.Builder is that it makes the Schema 
class even bigger.  The upside is that if you 'import Schema.Builder' then the 
code is sleeker.  But perhaps the preferred import should instead be 'import 
static SchemaBuilder.*'?  The static methods have unique-enough names that this 
might work well.  What do you think?
 - We can convert from Java object to JsonNode by parsing the output of 
GenericData.toString(Object).



[jira] [Commented] (AVRO-1274) Add a schema builder API

2013-03-12 Thread Josh Wills (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600838#comment-13600838
 ] 

Josh Wills commented on AVRO-1274:
--

Hey Tom-- I wrote something along these lines way back in the day:

https://github.com/jwills/avroplay/blob/master/src/com/randomgraphs/avro/RecordSchemaBuilder.java

The general orientation is towards supporting the union { null, T } pattern for 
optional fields w/default values, and it ends up looking like:

Schema schema = new RecordSchemaBuilder("myrecord")
    .requiredString("foo")
    .optionalFloat("bar", 17.29f)
    .array("baz", Schema.create(Schema.Type.STRING))
    .build();

It has support for default values for primitive types and just wraps them in 
JsonNodes as need be, and is smart about checking to see if your record is 
named or anonymous. I'm happy to re-format it as a patch if you think it's 
worthwhile. My main feeling was that the name, type, and required/optional 
nature of the field are the three things you really always have to know, and 
whether or not you have a doc string or sort order info should be hidden away as 
rarely-used options in this context.
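
A minimal sketch of that style of builder, in plain Java and emitting schema JSON as a string, might look like the following. The class RecordSchemaSketch and its methods are hypothetical and are not the RecordSchemaBuilder linked above. As an assumption that departs from the { null, T } orientation, the sketch puts the non-null type first when a non-null default is given, following the rule stated elsewhere in this thread that a union's default must match its first branch. Names are not escaped, so it only suits simple identifiers.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy fluent builder that emits record schema JSON; hypothetical, not an Avro API. */
final class RecordSchemaSketch {
    private final String name;
    private final List<String> fields = new ArrayList<>();

    RecordSchemaSketch(String name) {
        this.name = name;
    }

    /** Required field: a plain string type with no default. */
    RecordSchemaSketch requiredString(String field) {
        fields.add("{\"name\":\"" + field + "\",\"type\":\"string\"}");
        return this;
    }

    /** Optional field with a non-null default: the float branch comes first. */
    RecordSchemaSketch optionalFloat(String field, float defaultValue) {
        fields.add("{\"name\":\"" + field
            + "\",\"type\":[\"float\",\"null\"],\"default\":" + defaultValue + "}");
        return this;
    }

    /** Nullable field with no other default: null branch first, default null. */
    RecordSchemaSketch nullableFloat(String field) {
        fields.add("{\"name\":\"" + field
            + "\",\"type\":[\"null\",\"float\"],\"default\":null}");
        return this;
    }

    /** Assembles the record schema as a JSON string. */
    String build() {
        return "{\"type\":\"record\",\"name\":\"" + name
            + "\",\"fields\":[" + String.join(",", fields) + "]}";
    }

    public static void main(String[] args) {
        String json = new RecordSchemaSketch("myrecord")
            .requiredString("foo")
            .optionalFloat("bar", 17.29f)
            .nullableFloat("qux")
            .build();
        System.out.println(json);
    }
}
```

Doc strings and sort order are left out entirely here, in keeping with the view that name, type, and required/optional status are what a caller nearly always has to supply.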
