C++ interface, json, schema, immutability

2017-07-06 Thread Dan Schmitt
I was hoping I wasn't doing something complex, but I may be wrong.

My goal state is to allow callers to send arbitrary avro buffers to
me, and then merge them into a schema.

I seem to be stuck in that the exposed methods to extract the schema
from the buffer/stream etc result in a ValidSchema, which I don't seem
to be able to convert to something I can add to an existing schema.

As an example, imagine I have a caller passing me a hunk of memory
with data that has defined their own schema as:

{
  "type" : "record",
  "name" : "ErrorInfo",
  "fields" : [ {
    "name" : "File",
    "type" : "string"
  }, {
    "name" : "LineNumber",
    "type" : "int"
  }, {
    "name" : "Message",
    "type" : "string"
  } ]
}


And I want to store each hunk as a record with a timestamp when I got it.

{
  "type" : "record",
  "name" : "Log",
  "fields" : [ {
    "name" : "time",
    "type" : {
      "type" : "long",
      "logicalType" : "timestamp-millis"
    }
  }, {
    "name" : "payload",
    "type" : [ "null" ]
  } ]
}

I'd like to be able to load up my timestamp log fields, then modify
the payload union to include the ErrorInfo type, so somebody who dumps
the avro file can see File/LineNumber/Message and my logger doesn't
have to care about that bit of info (and anybody can send any type of
structured data to the logger).

I don't see a way to convert the node/root from ValidSchema to
something I can pass to avro::UnionSchema.addType(). Am I missing
something?

Note that a timestamp logger is a contrived example to express what I
want to do (I know about all the nifty logging systems out there, but
I do have a use case for doing dynamic schema merges.)
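
One possible workaround, if ValidSchema really cannot be modified in place, is to do the merge on the schema's JSON text and compile the merged text as a new schema. The sketch below only shows the text-splicing step, and it is written in Java for illustration; LogSchemaText and withPayload are hypothetical names, not part of any Avro API. On the C++ side the caller's schema text would presumably come from ValidSchema::toJson and the result would go to avro::compileJsonSchemaFromString, but treat that pairing as an assumption to check against your Avro version.

```java
// Hypothetical helper: builds the Log schema text with one caller-supplied
// schema added to the payload union. It does no JSON validation.
final class LogSchemaText {
    static String withPayload(String payloadSchemaJson) {
        return String.join("\n",
            "{",
            "  \"type\" : \"record\",",
            "  \"name\" : \"Log\",",
            "  \"fields\" : [ {",
            "    \"name\" : \"time\",",
            "    \"type\" : { \"type\" : \"long\", \"logicalType\" : \"timestamp-millis\" }",
            "  }, {",
            "    \"name\" : \"payload\",",
            // The union keeps "null" first and appends the caller's schema.
            "    \"type\" : [ \"null\", " + payloadSchemaJson + " ]",
            "  } ]",
            "}");
    }
}
```

Because the merged text is compiled from scratch, the original schema objects never need to be mutated; the cost is a re-parse each time a new payload type shows up.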


Re: Avro schema properties contention on multithread read

2017-07-06 Thread fady
On 05.07.2017 21:53, Zoltan Farkas wrote:

> The synchronization in JsonProperties is currently inconsistent (see
> getObjectProps()) which makes current implementation @NotThreadSafe
> 
> I think it would be probably best to remove synchronization from those 
> methods... and add @NotThreadSafe to the class... 
> Utilities like Schemas.synchronizedSchema(...) and 
> Schemas.unmodifiableSchema(...) could be added to help with various use 
> cases... 
> 
> --Z

Thank you for your reply. I like your Schemas.unmodifiableSchema(...) a
lot. 

While what you are describing would be ideal, a simpler solution might
be to change the LinkedHashMap that backs jsonProperties into something
like a ConcurrentHashMap, avoiding the need for synchronization. 

That being said, ConcurrentHashMap itself does not preserve insertion
order, so it's not a drop-in replacement for LinkedHashMap.
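
One way to keep insertion order without locking on the read path is copy-on-write: readers see an immutable LinkedHashMap snapshot, and writers publish a fresh copy. This is a different technique from either suggestion above, and the sketch below is only an illustration under that assumption. OrderedProps is a hypothetical stand-in, not Avro's JsonProperties.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical property holder: ordered, lock-free reads, copy-on-write updates.
final class OrderedProps {
    // Readers only ever see an immutable, insertion-ordered snapshot.
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    // Writers copy the current map, add the entry, then publish the new snapshot.
    synchronized void add(String name, String value) {
        Map<String, String> copy = new LinkedHashMap<>(snapshot);
        copy.put(name, value);
        snapshot = Collections.unmodifiableMap(copy);
    }

    // No lock here: a volatile read followed by a lookup in an immutable map.
    String get(String name) {
        return snapshot.get(name);
    }

    // Returns the current snapshot; later add() calls do not change it.
    Map<String, String> all() {
        return snapshot;
    }
}
```

Each add() copies the whole map, so this only pays off when properties are written rarely and read often, which seems to be the usual pattern for schema properties.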

RE: Impossible to start an union with a named type

2017-07-06 Thread Tony Imbault
Thank you Nandor,

I had already done this test, but in my context it is very important to respect 
the specified order.
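
One plausible reason the order matters (an assumption here, not something stated in the message) is that Avro's binary encoding writes the zero-based position of the chosen branch before the value, so ["Author", "string"] and ["string", "Author"] produce different bytes for the same string. The sketch below hand-rolls that encoding for a string branch only; UnionOrderDemo is a hypothetical class, not Avro's encoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical demo of how union branch order shows up in the encoded bytes.
final class UnionOrderDemo {
    // Avro writes int/long values as zig-zag, variable-length integers.
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63);            // zig-zag encode
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Encodes a string value stored in a union with the given branch names.
    static byte[] encodeStringInUnion(List<String> branches, String value) {
        int index = branches.indexOf("string");
        if (index < 0) {
            throw new IllegalArgumentException("union has no string branch");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(index, out);                    // position of the branch in the union
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(utf8.length, out);              // string length in bytes
        out.write(utf8, 0, utf8.length);          // UTF-8 payload
        return out.toByteArray();
    }
}
```

With "string" as the second branch the first byte is 0x02 (zig-zag of 1); with "string" first it is 0x00, so data written against one ordering cannot be read with the other without schema resolution.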

In my opinion the class SchemaBuilder.UnionFieldTypeBuilder should provide the 
methods type(name), type(name, namespace) and type(schemaType).

I will check whether a JIRA issue already exists for this.

Thank you

Tony


From: Nandor Kollar [mailto:nkol...@cloudera.com]
Sent: Thursday, July 6, 2017 10:43
To: user@avro.apache.org
Subject: Re: Impossible to start an union with a named type

Hi Tony,

If you change the order of elements in the author field, it is fine:

@Test
public void testUnion() {
  Schema schema = SchemaBuilder
      .unionOf()
        .record("Author")
          .fields()
          .name("name").type().stringType().noDefault()
          .name("birthday").type().longType().noDefault()
        .endRecord().and()
        .record("Book")
          .fields()
          .name("title").type().stringType().noDefault()
          .name("author").type().unionOf()
            .stringType().and()
            .type("Author")
          .endUnion()
          .noDefault()
        .endRecord().endUnion();
}

The specification doesn't say anything about the order of types in a union, so 
I guess the builder shouldn't limit it. I think you're right: your code should 
compile. Maybe this is a bug in the SchemaBuilder?

Nandor

On Thu, Jul 6, 2017 at 10:14 AM, Tony Imbault wrote:
Hi everybody,

I am trying to build the following JSON schema:

[{
 "type": "record",
 "name": "Author",
 "fields": [{
  "name": "name",
  "type": "string"
 },
 {
  "name": "birthday",
  "type": "long"
 }]
},
{
 "type": "record",
 "name": "Book",
 "fields": [{
  "name": "title",
  "type": "string"
 },
 {
  "name": "author",
  "type": ["Author", "string"]
 }]
}]

In my humble opinion this JSON schema is valid.

But it seems that the SchemaBuilder API does not allow building this schema.
The corresponding code should be something like this:

Schema schema = SchemaBuilder
 .unionOf()
 .record("Author")
 .fields()
 .name("name").type().stringType().noDefault()
 .name("birthday").type().longType().noDefault()
 .endRecord().and()
 .record("Book")
 .fields()
 .name("title").type().stringType().noDefault()
 .name("author").type().unionOf()
 .type("Author").and()
 .stringType().endUnion()
 .noDefault()
 .endRecord().endUnion();

But after a call to unionOf() the type(name) method is not available.

Is there any reason for this limitation?

Thank you very much

Tony Imbault



Re: Impossible to start an union with a named type

2017-07-06 Thread Nandor Kollar
Hi Tony,

If you change the order of elements in the author field, it is fine:

@Test
public void testUnion() {
  Schema schema = SchemaBuilder
      .unionOf()
        .record("Author")
          .fields()
          .name("name").type().stringType().noDefault()
          .name("birthday").type().longType().noDefault()
        .endRecord().and()
        .record("Book")
          .fields()
          .name("title").type().stringType().noDefault()
          .name("author").type().unionOf()
            .stringType().and()
            .type("Author")
          .endUnion()
          .noDefault()
        .endRecord().endUnion();
}

The specification doesn't say anything about the order of types in a union,
so I guess the builder shouldn't limit it. I think you're right: your code
should compile. Maybe this is a bug in the SchemaBuilder?

Nandor

On Thu, Jul 6, 2017 at 10:14 AM, Tony Imbault wrote:

> Hi everybody,
>
> I am trying to build the following JSON schema:
>
> [{
>  "type": "record",
>  "name": "Author",
>  "fields": [{
>   "name": "name",
>   "type": "string"
>  },
>  {
>   "name": "birthday",
>   "type": "long"
>  }]
> },
> {
>  "type": "record",
>  "name": "Book",
>  "fields": [{
>   "name": "title",
>   "type": "string"
>  },
>  {
>   "name": "author",
>   "type": ["Author", "string"]
>  }]
> }]
>
> In my humble opinion this JSON schema is valid.
>
> But it seems that the SchemaBuilder API does not allow building this
> schema.
> The corresponding code should be something like this:
>
> Schema schema = SchemaBuilder
>  .unionOf()
>  .record("Author")
>  .fields()
>  .name("name").type().stringType().noDefault()
>  .name("birthday").type().longType().noDefault()
>  .endRecord().and()
>  .record("Book")
>  .fields()
>  .name("title").type().stringType().noDefault()
>  .name("author").type().unionOf()
>  .type("Author").and()
>  .stringType().endUnion()
>  .noDefault()
>  .endRecord().endUnion();
>
> But after a call to unionOf() the type(name) method is not available.
>
> Is there any reason for this limitation?
>
> Thank you very much
>
> Tony Imbault
>


Impossible to start an union with a named type

2017-07-06 Thread Tony Imbault
Hi everybody,

I am trying to build the following JSON schema:

[{
 "type": "record",
 "name": "Author",
 "fields": [{
  "name": "name",
  "type": "string"
 },
 {
  "name": "birthday",
  "type": "long"
 }]
},
{
 "type": "record",
 "name": "Book",
 "fields": [{
  "name": "title",
  "type": "string"
 },
 {
  "name": "author",
  "type": ["Author", "string"]
 }]
}]

In my humble opinion this JSON schema is valid.

But it seems that the SchemaBuilder API does not allow building this schema.
The corresponding code should be something like this:

Schema schema = SchemaBuilder
 .unionOf()
 .record("Author")
 .fields()
 .name("name").type().stringType().noDefault()
 .name("birthday").type().longType().noDefault()
 .endRecord().and()
 .record("Book")
 .fields()
 .name("title").type().stringType().noDefault()
 .name("author").type().unionOf()
 .type("Author").and()
 .stringType().endUnion()
 .noDefault()
 .endRecord().endUnion();

But after a call to unionOf() the type(name) method is not available.

Is there any reason for this limitation?

Thank you very much

Tony Imbault