[jira] [Updated] (AVRO-2283) Typo in docs (three => five)

2018-12-11 Thread Ross Blanchard (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ross Blanchard updated AVRO-2283:
-
Attachment: AVRO-2283.patch

> Typo in docs (three => five)
> 
>
> Key: AVRO-2283
> URL: https://issues.apache.org/jira/browse/AVRO-2283
> Project: Apache Avro
> Issue Type: Bug
> Components: spec
> Affects Versions: 1.8.2
> Reporter: Ross Blanchard
> Priority: Trivial
> Attachments: AVRO-2283.patch
>
>
> Docs state that the `record` type supports three attributes, then list the five 
> attributes that it supports: 
> https://github.com/apache/avro/blob/9855529f1af979ade479d5658bbe90917ae2734b/doc/src/content/xdocs/spec.xml#L89



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AVRO-2283) Typo in docs (three => five)

2018-12-11 Thread Ross Blanchard (JIRA)
Ross Blanchard created AVRO-2283:


 Summary: Typo in docs (three => five)
 Key: AVRO-2283
 URL: https://issues.apache.org/jira/browse/AVRO-2283
 Project: Apache Avro
  Issue Type: Bug
  Components: spec
Affects Versions: 1.8.2
Reporter: Ross Blanchard


Docs state that the `record` type supports three attributes, then list the five 
attributes that it supports: 
https://github.com/apache/avro/blob/9855529f1af979ade479d5658bbe90917ae2734b/doc/src/content/xdocs/spec.xml#L89





[jira] [Resolved] (AVRO-1749) Maven plugin goal: automatic schemas (.avsc) generation from Java classes (.java)

2018-12-11 Thread Daniel Kulp (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kulp resolved AVRO-1749.
---
   Resolution: Fixed
 Assignee: Daniel Kulp
Fix Version/s: 1.9.0

> Maven plugin goal: automatic schemas (.avsc) generation from Java classes 
> (.java)
> -
>
> Key: AVRO-1749
> URL: https://issues.apache.org/jira/browse/AVRO-1749
> Project: Apache Avro
> Issue Type: New Feature
> Components: java
> Reporter: Matheus Santana
> Assignee: Daniel Kulp
> Priority: Minor
> Fix For: 1.9.0
>
>
> The current Maven plugin includes goals for generating Java code (classes and 
> interfaces) from IDL (.avdl files) and from Avro protocol / schema definitions 
> (.avpr / .avsc).
> It would be nice to provide a goal for automatically [inducing schemas from Java 
> code|https://avro.apache.org/docs/current/api/java/org/apache/avro/tool/InduceSchemaTool.html].





[jira] [Commented] (AVRO-1749) Maven plugin goal: automatic schemas (.avsc) generation from Java classes (.java)

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717894#comment-16717894
 ] 

ASF GitHub Bot commented on AVRO-1749:
--

dkulp closed pull request #70: AVRO-1749 Java: Introduce induce Maven plugin 
goal
URL: https://github.com/apache/avro/pull/70
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/lang/java/maven-plugin/src/main/java/org/apache/avro/mojo/InduceMojo.java b/lang/java/maven-plugin/src/main/java/org/apache/avro/mojo/InduceMojo.java
new file mode 100644
index 0..9a29bb650
--- /dev/null
+++ b/lang/java/maven-plugin/src/main/java/org/apache/avro/mojo/InduceMojo.java
@@ -0,0 +1,137 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.avro.mojo;
+
+import java.io.File;
+import java.io.PrintWriter;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.util.List;
+
+import org.apache.avro.reflect.ReflectData;
+import org.apache.maven.plugin.AbstractMojo;
+import org.apache.maven.plugin.MojoExecutionException;
+import org.apache.maven.project.MavenProject;
+
+/**
+ * Generate Avro files (.avsc and .avpr) from Java classes or interfaces
+ * 
+ * @goal induce
+ * @phase process-classes
+ * @threadSafe
+ */
+public class InduceMojo extends AbstractMojo {
+  /**
+   * The source directory of Java classes.
+   *
+   * @parameter property="sourceDirectory"
+   *            default-value="${basedir}/src/main/java"
+   */
+  private File sourceDirectory;
+
+  /**
+   * Directory where to output Avro schemas (.avsc) or protocols (.avpr).
+   *
+   * @parameter property="outputDirectory"
+   *            default-value="${basedir}/generated-resources/avro"
+   */
+  private File outputDirectory;
+
+  /**
+   * The current Maven project.
+   * 
+   * @parameter default-value="${project}"
+   * @readonly
+   * @required
+   */
+  protected MavenProject project;
+
+  public void execute() throws MojoExecutionException {
+    ClassLoader classLoader = getClassLoader();
+
+    for(File inputFile : sourceDirectory.listFiles()) {
+      String className = parseClassName(inputFile.getPath());
+      Class klass = loadClass(classLoader, className);
+      String fileName = outputDirectory.getPath() + "/" + parseFileName(klass);
+      File outputFile = new File(fileName);
+      outputFile.getParentFile().mkdirs();
+      try {
+        PrintWriter writer = new PrintWriter(fileName, "UTF-8");
+        if(klass.isInterface()) {
+          writer.println(ReflectData.get().getProtocol(klass).toString(true));
+        } else {
+          writer.println(ReflectData.get().getSchema(klass).toString(true));
+        }
+        writer.close();
+      } catch(Exception e) {
+        e.printStackTrace();
+      }
+    }
+  }
+
+  private String parseClassName(String fileName) {
+    String indentifier = "java/";
+    int index = fileName.lastIndexOf(indentifier);
+    String namespacedFileName = fileName.substring(index + indentifier.length());
+
+    return namespacedFileName.replace("/", ".").replace(".java", "");
+  }
+
+  private String parseFileName(Class klass) {
+    String className = klass.getName().replace(".", "/");
+    if(klass.isInterface()) {
+      return className.concat(".avpr");
+    } else {
+      return className.concat(".avsc");
+    }
+  }
+
+  private Class loadClass(ClassLoader classLoader, String className) {
+    Class klass = null;
+
+    try {
+      klass = classLoader.loadClass(className);
+    } catch(ClassNotFoundException e) {
+      e.printStackTrace();
+    }
+
+    return klass;
+  }
+
+  private ClassLoader getClassLoader() {
+    ClassLoader classLoader = null;
+
+    try {
+      List classpathElements = project.getRuntimeClasspathElements();
+      if(null == classpathElements) {
+        return Thread.currentThread().getContextClassLoader();
+      }
+      URL[] urls = new URL[classpathElements.size()];
+
+      for(int i = 0; i < 
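
The path handling in `parseClassName` and `parseFileName` from the (truncated) diff above can be tried in isolation. The following is a minimal sketch using only the JDK, not the plugin's actual class: the name `InducePathSketch`, the sample paths, and the `isInterface` flag are assumptions made for illustration (the original `parseFileName` takes a loaded `Class`).

```java
// Stand-alone sketch of the path handling; not the plugin's actual class.
final class InducePathSketch {

    // Mirrors parseClassName: keep everything after the last "java/" and
    // turn the remaining path into a dotted class name.
    static String parseClassName(String fileName) {
        String identifier = "java/";
        int index = fileName.lastIndexOf(identifier);
        String namespacedFileName = fileName.substring(index + identifier.length());
        return namespacedFileName.replace("/", ".").replace(".java", "");
    }

    // Mirrors parseFileName, taking the class name and an interface flag
    // (assumption: the original inspects a loaded Class instead).
    static String parseFileName(String className, boolean isInterface) {
        String path = className.replace(".", "/");
        return path.concat(isInterface ? ".avpr" : ".avsc");
    }

    public static void main(String[] args) {
        System.out.println(parseClassName("/work/src/main/java/com/example/Foo.java")); // com.example.Foo
        System.out.println(parseFileName("com.example.Foo", false));       // com/example/Foo.avsc
        System.out.println(parseFileName("com.example.FooService", true)); // com/example/FooService.avpr
    }
}
```

One consequence of searching for the last `java/` is that a package segment named `java` shortens the result: in this sketch, `src/main/java/org/java/Foo.java` maps to `Foo` rather than `org.java.Foo`.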

[jira] [Commented] (AVRO-1749) Maven plugin goal: automatic schemas (.avsc) generation from Java classes (.java)

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717895#comment-16717895
 ] 

ASF subversion and git services commented on AVRO-1749:
---

Commit 4b498e5c76569dfa7acb7e62f175075db3d314e2 in avro's branch 
refs/heads/master from [~embs]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4b498e5 ]

AVRO-1749 Java: Introduce induce Maven plugin goal (#70)



> Maven plugin goal: automatic schemas (.avsc) generation from Java classes 
> (.java)
> -
>
> Key: AVRO-1749
> URL: https://issues.apache.org/jira/browse/AVRO-1749
> Project: Apache Avro
> Issue Type: New Feature
> Components: java
> Reporter: Matheus Santana
> Priority: Minor
>
> The current Maven plugin includes goals for generating Java code (classes and 
> interfaces) from IDL (.avdl files) and from Avro protocol / schema definitions 
> (.avpr / .avsc).
> It would be nice to provide a goal for automatically [inducing schemas from Java 
> code|https://avro.apache.org/docs/current/api/java/org/apache/avro/tool/InduceSchemaTool.html].





[jira] [Commented] (AVRO-2173) remove CHANGES.txt

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717791#comment-16717791
 ] 

ASF subversion and git services commented on AVRO-2173:
---

Commit 029965d21903204ff2e5a21db036cac195780742 in avro's branch 
refs/heads/branch-1.8 from [~nielsbasjes]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=029965d ]

AVRO-2173: remove CHANGES.txt


> remove CHANGES.txt
> --
>
> Key: AVRO-2173
> URL: https://issues.apache.org/jira/browse/AVRO-2173
> Project: Apache Avro
> Issue Type: Improvement
> Reporter: Doug Cutting
> Priority: Major
>
> The CHANGES.txt file is not well maintained and redundant with information in 
> Jira.
> Let's remove this file, and instead generate release notes from Jira.





[jira] [Resolved] (AVRO-2065) Avro java code generation for Unions doesn't set converters for unions

2018-12-11 Thread Daniel Kulp (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kulp resolved AVRO-2065.
---
   Resolution: Fixed
Fix Version/s: 1.9.0 (was: 1.8.3)

> Avro java code generation for Unions doesn't set converters for unions
> --
>
> Key: AVRO-2065
> URL: https://issues.apache.org/jira/browse/AVRO-2065
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.2
> Reporter: Stephane Maarek
> Priority: Blocker
> Fix For: 1.9.0
>
>
> The full issue is here: 
> https://stackoverflow.com/questions/45581437/how-to-specify-converter-for-default-value-in-avro-union-logical-type-fields
> {code}
> {
>   "name":"my_optional_date",
>   "type":[
>     {
>       "type":"long",
>       "logicalType":"timestamp-millis"
>     },
>     "null"
>   ],
>   "default":1502250227187
> }
> {code}
> Doesn't generate the right conversions
> {code}
> private static final org.apache.avro.Conversion[] conversions =
>   new org.apache.avro.Conversion[] {
>     null,  // <-- THIS ONE IS NOT SET PROPERLY, should be TIMESTAMP_CONVERSION
>     null
>   };
> {code}
> The code fails on 
> {code}
> BuggyRecord.Builder buggyRecordBuilder = BuggyRecord.newBuilder();
> buggyRecordBuilder.build();
> {code}
> with 
> {code}
> org.apache.avro.AvroRuntimeException: java.lang.ClassCastException: java.lang.Long cannot be cast to org.joda.time.DateTime
>     at com.example.BuggyRecord$Builder.build(BuggyRecord.java:301)
>     at BuggyRecordTest.Foo(BuggyRecordTest.java:10)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to org.joda.time.DateTime
>     at com.example.BuggyRecord$Builder.build(BuggyRecord.java:298)
>     ... 23 more
> {code}





[jira] [Commented] (AVRO-766) C: Memory leak from reference count cycles

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717802#comment-16717802
 ] 

ASF subversion and git services commented on AVRO-766:
--

Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch 
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]

AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)

* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK

- Add hash of named schemas found during copy
- Find saved named  schema for copy of AVRO_LINK

* AVRO-766: Correct memory leaks in AVRO_LINK copy

- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy

* Enable TEST_AVRO_1167 in test_avro_766

This ensures that both fixes work together and that no valgrind errors are 
produced from a recursive schema.


> C: Memory leak from reference count cycles
> --
>
> Key: AVRO-766
> URL: https://issues.apache.org/jira/browse/AVRO-766
> Project: Apache Avro
> Issue Type: Bug
> Components: c
> Affects Versions: 1.5.0
> Reporter: Douglas Creager
> Priority: Major
> Attachments: AVRO-766.patch, ref-cycle.c
>
>
> If you parse a recursive Avro schema, you end up with a cycle in the 
> reference graph for the avro_schema_t objects that are created.  The 
> reference counting mechanism that we're using can't detect this, and so you 
> get a memory leak.
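
The leak described above is a general property of reference counting, and it can be reproduced without the C library. The sketch below is a hand-rolled illustration in Java, assuming a simplified model: `Counted`, `link`, and `release` are hypothetical names and do not correspond to functions in the Avro C API.

```java
// Minimal model of manual reference counting; names are hypothetical.
final class RefCountCycleSketch {

    static final class Counted {
        int refCount = 1;       // the creator holds the first reference
        Counted target;         // optional link to another counted object
        boolean freed = false;  // set when the count drops to zero
    }

    // Point `from` at `to`, taking a new reference on the target.
    static void link(Counted from, Counted to) {
        to.refCount++;
        from.target = to;
    }

    // Drop one reference; free the object (and release its target)
    // only when no references remain.
    static void release(Counted obj) {
        obj.refCount--;
        if (obj.refCount == 0) {
            obj.freed = true;
            if (obj.target != null) {
                release(obj.target);
            }
        }
    }

    public static void main(String[] args) {
        Counted recursive = new Counted();
        link(recursive, recursive);  // a self-reference, as in a recursive schema
        release(recursive);          // the creator lets go
        // The self-reference keeps the count at 1, so the object is never freed.
        System.out.println(recursive.freed + " " + recursive.refCount);
    }
}
```

In this model, an object that is not part of a cycle is freed as soon as its last holder releases it, while the self-linked object stays allocated because its own link accounts for the remaining count.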





[jira] [Commented] (AVRO-1858) Update DataFileReadTool (tojson) to support a "head" concept

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717796#comment-16717796
 ] 

ASF subversion and git services commented on AVRO-1858:
---

Commit 6970d2f64ab77748f895bb82b0fa6cd512a34c30 in avro's branch 
refs/heads/branch-1.8 from MikeHurleySurescripts
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=6970d2f ]

AVRO-1858 add tojson head mode (#100)

* AVRO-1858: added --head option to the tojson operation

* AVRO-1858: added unit tests for tojson --head option

* AVRO-1858: head input and record counters are now longs

* AVRO-1858: added tojson --head tests for zero and negative values. Negative 
head count is now an error.


> Update DataFileReadTool (tojson) to support a "head" concept
> 
>
> Key: AVRO-1858
> URL: https://issues.apache.org/jira/browse/AVRO-1858
> Project: Apache Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.8.1
> Reporter: Mike Hurley
> Assignee: Mike Hurley
> Priority: Major
> Fix For: 1.9.0
>
>
> It would be nice if the tojson operator supported a "head" concept in order 
> to get a sampling of records in an Avro file.
> Allow specifying a maximum record count to display. If no max is given in 
> head mode, use a reasonable default (like 10).
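
The behaviour requested here, together with the commit notes above (record counters as longs, a negative head count treated as an error), can be sketched independently of the Avro tools. This is a minimal sketch under stated assumptions: `HeadSketch`, `head`, and `DEFAULT_HEAD_COUNT` are illustrative names, the default of 10 is taken from the issue text, and the records are a plain `Iterator` rather than an Avro data file reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative "head" helper; not the tojson implementation.
final class HeadSketch {

    // Default suggested in the issue description ("like 10").
    static final long DEFAULT_HEAD_COUNT = 10;

    // Return at most headCount records from the iterator.
    static <T> List<T> head(Iterator<T> records, long headCount) {
        if (headCount < 0) {
            throw new IllegalArgumentException("head count must not be negative: " + headCount);
        }
        List<T> result = new ArrayList<>();
        long seen = 0; // a long, so very large files cannot overflow the counter
        while (seen < headCount && records.hasNext()) {
            result.add(records.next());
            seen++;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r1", "r2", "r3", "r4");
        System.out.println(head(records.iterator(), 2));                  // [r1, r2]
        System.out.println(head(records.iterator(), DEFAULT_HEAD_COUNT)); // [r1, r2, r3, r4]
    }
}
```

Stopping as soon as the limit is reached means the remaining records are never read, which is the point of sampling a large file.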





[jira] [Commented] (AVRO-1167) Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717797#comment-16717797
 ] 

ASF subversion and git services commented on AVRO-1167:
---

Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch 
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]

AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)

* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK

- Add hash of named schemas found during copy
- Find saved named  schema for copy of AVRO_LINK

* AVRO-766: Correct memory leaks in AVRO_LINK copy

- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy

* Enable TEST_AVRO_1167 in test_avro_766

This ensures that both fixes work together and that no valgrind errors are 
produced from a recursive schema.


> Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
> -
>
> Key: AVRO-1167
> URL: https://issues.apache.org/jira/browse/AVRO-1167
> Project: Apache Avro
> Issue Type: Bug
> Components: c
> Affects Versions: 1.7.1
> Environment: Ubuntu Linux 11.10
> Reporter: Vivek Nadkarni
> Priority: Major
> Attachments: AVRO-1167-TEST.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> When avro_schema_copy() encounters an AVRO_LINK from an old_schema to a 
> new_schema, it sets the target of the new_link to the target of the old_link 
> in the old_schema. Thus, the AVRO_LINK in the new_schema points to an element 
> in the old_schema. 
> While this is currently safe, since the reference count of the target in the 
> old_schema is incremented, we are not really making a "copy" of the schema.
> There is a "TODO" in the code, which says that we should make an 
> avro_schema_copy() of the target in old_schema instead of linking directly to 
> it. However, this solution of making a copy would result in a few problems:
> 1. Avro schemas are intended to be self-contained. That implies that 
> AVRO_LINKs are intended to be internal links inside a self-contained schema. 
> The code introduces unnecessary (and potentially disallowed) external 
> dependencies in an Avro schema. 
> 2. The purpose of copying a schema is to decouple the old_schema 
> from the new_schema. The two copies may have different owners, or we may want 
> to deallocate the old schema, etc.
> 3. If the schema is recursive, then the code would enter an infinite 
> recursion loop.
> It appears to me that the "correct" solution would be to replicate the entire 
> structure of the current schema, including the internal links. This means 
> that if old_link_A points to old_target_B, then new_link_A should point to 
> new_target_B in the new schema. Note that there should only be one copy of 
> new_target_B in the new schema, even if there are multiple links pointing to 
> new_target_B - i.e. we should not make a new copy for each link.
> In order to implement this proper copying of links, we would need to keep a 
> lookup table of pairs of old and new schemas as they are being created, as 
> well as a list of all the AVRO_LINKs that are copied. Then as a post-copy 
> step, we would go and fix up all the AVRO_LINKs to point to the appropriate 
> targets. This is the way the schema is constructed in the first place in 
> avro_schema_from_json().
> An inefficient way to obtain the correct result from avro_schema_copy() would 
> be to perform an avro_schema_to_json() followed by an avro_schema_from_json().
> Note: I have not implemented a fix for this issue, but I am documenting this 
> issue in AVRO-JIRA because this issue needs to be resolved before AVRO-766 
> can be fixed.
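
The copying scheme proposed above (one new target per old target, with links redirected to the new targets) can be illustrated on a generic object graph. The sketch below is written in Java rather than C and is not the avro_schema_copy() implementation: `Node`, `copy`, and the identity-keyed lookup table are assumptions chosen to show the idea. It also folds the "fix up links afterwards" step into the traversal by registering each copy before following its links.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Generic link-preserving deep copy; names are hypothetical.
final class LinkPreservingCopySketch {

    static final class Node {
        final String name;
        final List<Node> links = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static Node copy(Node root) {
        // Lookup table of old node -> new node, keyed by object identity.
        return copy(root, new IdentityHashMap<>());
    }

    private static Node copy(Node old, Map<Node, Node> copies) {
        Node existing = copies.get(old);
        if (existing != null) {
            return existing; // already copied: reuse it, never duplicate
        }
        Node fresh = new Node(old.name);
        copies.put(old, fresh); // register before descending so cycles terminate
        for (Node target : old.links) {
            fresh.links.add(copy(target, copies));
        }
        return fresh;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        a.links.add(b);
        a.links.add(b); // two links to the same target
        b.links.add(a); // and a cycle back to A
        Node newA = copy(a);
        System.out.println(newA.links.get(0) == newA.links.get(1)); // true
        System.out.println(newA.links.get(0).links.get(0) == newA); // true
    }
}
```

Both links in the copy point at the same new target, no link points back into the old graph, and the recursive reference does not cause unbounded recursion.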





[jira] [Commented] (AVRO-1957) TimeConversions do not implement getRecommendedSchema()

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717789#comment-16717789
 ] 

ASF subversion and git services commented on AVRO-1957:
---

Commit 3fe8df71592b149500f2f5e1c666268ded63d1fc in avro's branch 
refs/heads/branch-1.8 from [~nstimm]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=3fe8df7 ]

AVRO-1957: TimeConversions do not implement getRecommendedSchema()

This closes #154

Signed-off-by: Gabor Szadovszky 
Signed-off-by: sacharya 
Signed-off-by: Nandor Kollar 
(cherry picked from commit 027d196fd3f6f990977a4e86c1c07464506cabc3)


> TimeConversions do not implement getRecommendedSchema()
> ---
>
> Key: AVRO-1957
> URL: https://issues.apache.org/jira/browse/AVRO-1957
> Project: Apache Avro
> Issue Type: Bug
> Affects Versions: 1.8.1
> Reporter: Sean Timm
> Assignee: Sean Timm
> Priority: Major
> Fix For: 1.9.0, 1.8.3
>
>
> org.apache.avro.data.TimeConversions.TimestampConversion and other date and 
> time conversions do not implement getRecommendedSchema(). When trying to 
> dynamically generate an Avro schema from a pojo that contains a DateTime 
> object using ReflectData, I get an unsupported operation exception.
> I think the implementation should be as simple as
> {code}
> @Override
> public Schema getRecommendedSchema() {
>   return LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
> }
> {code}





[jira] [Commented] (AVRO-1954) Schema.Field.defaultVal() generates: Unknown datum type org.apache.avro.JsonProperties$Null

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717790#comment-16717790
 ] 

ASF subversion and git services commented on AVRO-1954:
---

Commit 10792f777f5a31f8c1e5d9a6f6c6d4ba2727ab58 in avro's branch 
refs/heads/branch-1.8 from [~nkollar]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=10792f7 ]

AVRO-1954 - Schema.Field.defaultVal() generates: Unknown datum type 
org.apache.avro.JsonProperties$Null. Contributed by Nandor Kollar

(cherry picked from commit d9338a4cf008b445ea3efbe2523288d07162ec71)


> Schema.Field.defaultVal() generates: Unknown datum type 
> org.apache.avro.JsonProperties$Null
> ---
>
> Key: AVRO-1954
> URL: https://issues.apache.org/jira/browse/AVRO-1954
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.1, 1.9.0
> Reporter: rui miranda
> Assignee: Nandor Kollar
> Priority: Minor
> Fix For: 1.9.0
>
> Attachments: unitTestDefaultNull.patch
>
>
> I was creating GenericRecords and populating some fields -- those whose 
> content I could not find in some JSON files -- with Schema.Field.defaultVal(). 
> It seems that if the schema explicitly sets the default value to null, the 
> records generated this way can't be written. In this case, an instance of 
> org.apache.avro.JsonProperties.NULL_VALUE is returned by 
> Schema.Field.defaultVal().
> I created a unit test which replicates the bug. I was thinking of modifying 
> the class org.apache.avro.generic.GenericData to evaluate 
> org.apache.avro.JsonProperties.NULL_VALUE as null. Is this the way to go? Or 
> is org.apache.avro.JsonProperties.NULL_VALUE intended for other purposes?
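
The idea raised in the report, treating a "JSON null" marker object as a plain null before a value is written, can be pictured with a small stand-alone sketch. It deliberately does not use the Avro API: `NULL_SENTINEL` and `unwrapDefault` are hypothetical stand-ins for the marker object and for whatever conversion step a fix might add, and the sketch says nothing about how the actual fix was implemented.

```java
// Stand-alone illustration of unwrapping a null marker; not Avro code.
final class NullSentinelSketch {

    // Hypothetical stand-in for a marker object that represents an explicit JSON null.
    static final Object NULL_SENTINEL = new Object();

    // Map the marker to a real null; pass every other default value through unchanged.
    static Object unwrapDefault(Object defaultValue) {
        return defaultValue == NULL_SENTINEL ? null : defaultValue;
    }

    public static void main(String[] args) {
        System.out.println(unwrapDefault(NULL_SENTINEL)); // null
        System.out.println(unwrapDefault("fallback"));    // fallback
    }
}
```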





[jira] [Commented] (AVRO-1966) NPE When copying builder with nullable record

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717784#comment-16717784
 ] 

ASF subversion and git services commented on AVRO-1966:
---

Commit 8bed58c6ca57f43c20d2c0eae30e0d4e82d5e6fc in avro's branch 
refs/heads/branch-1.8 from [~nielsbasjes]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=8bed58c ]

AVRO-1966: Java: Fix NPE When copying builder with nullable record.


> NPE When copying builder with nullable record
> -
>
> Key: AVRO-1966
> URL: https://issues.apache.org/jira/browse/AVRO-1966
> Project: Apache Avro
> Issue Type: Bug
> Affects Versions: 1.8.1
> Reporter: Niels Basjes
> Assignee: Niels Basjes
> Priority: Critical
> Fix For: 1.9.0, 1.8.3
>
>
> Assume a schema with a record that embeds an optional record (i.e. the 
> reference is a union with null) whose default value is null.
> Then create a builder and copy that builder into a new builder.
> Using this copy will yield a NullPointerException.





[jira] [Commented] (AVRO-2122) Cannot validate schemas with recursive definitions

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717788#comment-16717788
 ] 

ASF subversion and git services commented on AVRO-2122:
---

Commit 49471412e5a10ff7b4f2806a1af03372d12b7945 in avro's branch 
refs/heads/branch-1.8 from Bart
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4947141 ]

AVRO-2122: Cannot validate schemas with recursive definitions

Track which symbols have been visited to avoid StackOverflowErrors
when validating schemas with recursive definitions

This closes #276

Signed-off-by: Nandor Kollar 

(cherry picked from commit 7f9cbca12af13d4b8b5709edba2bae4d4a808102)


> Cannot validate schemas with recursive definitions
> --
>
> Key: AVRO-2122
> URL: https://issues.apache.org/jira/browse/AVRO-2122
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.2
> Reporter: Bart
> Assignee: Bart
> Priority: Major
> Fix For: 1.7.8, 1.9.0, 1.8.3
>
>
> Validating a schema with a recursive definition will lead to a stack 
> overflow. When using the following schema definition:
> {noformat}
> @namespace("avro")
> protocol Unused {
>   record Node {
>     union { null, Node } value = null;
>   }
> }
> {noformat}
> {code:java}
> final SchemaValidator backwardValidator = new SchemaValidatorBuilder()
>   .canReadStrategy().validateLatest();
> backwardValidator.validate(Node.SCHEMA$, Arrays.asList(Node.SCHEMA$));
> {code}
> It results in a stack trace:
> {noformat}
> java.lang.StackOverflowError
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:406)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:392)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:383)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:406)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:392)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:406)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:374)
> at org.apache.avro.io.parsing.Symbol.hasErrors(Symbol.java:406)
> ...
> {noformat}
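
The commit message for this issue describes the remedy as tracking which symbols have already been visited. A minimal sketch of that technique on a generic graph follows; it is not the code in `org.apache.avro.io.parsing.Symbol`, and `Sym`, its `error` flag, and the `hasErrors` helper are hypothetical names used only to show why a visited set stops the unbounded recursion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Cycle-safe recursive check using a visited set; names are hypothetical.
final class VisitedSymbolsSketch {

    static final class Sym {
        final boolean error;
        final List<Sym> children = new ArrayList<>();
        Sym(boolean error) { this.error = error; }
    }

    static boolean hasErrors(Sym root) {
        // Identity-based set: two distinct symbols are never confused with each other.
        Set<Sym> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        return hasErrors(root, visited);
    }

    private static boolean hasErrors(Sym symbol, Set<Sym> visited) {
        if (!visited.add(symbol)) {
            return false; // already being checked or checked before: do not recurse again
        }
        if (symbol.error) {
            return true;
        }
        for (Sym child : symbol.children) {
            if (hasErrors(child, visited)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Sym node = new Sym(false);
        node.children.add(node); // recursive definition, like Node referring to Node
        System.out.println(hasErrors(node)); // false, and no StackOverflowError
    }
}
```

Without the `visited` check, the self-referencing symbol would be entered again on every call, which matches the repeating `hasErrors` frames in the stack trace above.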





[jira] [Commented] (AVRO-2120) NullPointerException thrown by Schema.Parser#parse(literal)

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717786#comment-16717786
 ] 

ASF subversion and git services commented on AVRO-2120:
---

Commit 9754e0489093f295402fb370018caaa422d4a2ed in avro's branch 
refs/heads/branch-1.8 from [~nielsbasjes]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=9754e04 ]

AVRO-2120: Fix NullPointerException thrown by Schema.Parser#parse("")


> NullPointerException thrown by Schema.Parser#parse(literal)
> ---
>
> Key: AVRO-2120
> URL: https://issues.apache.org/jira/browse/AVRO-2120
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.2
> Reporter: Sebastien Dubois
> Assignee: Niels Basjes
> Priority: Major
> Fix For: 1.9.0, 1.8.3
>
>
> Calling the parse method with an invalid input (e.g., "") instead of a valid 
> schema throws a NullPointerException.
> Expected behavior: SchemaParseException.





[jira] [Commented] (AVRO-1167) Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717799#comment-16717799
 ] 

ASF subversion and git services commented on AVRO-1167:
---

Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch 
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]

AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)

* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK

- Add hash of named schemas found during copy
- Find saved named schema for copy of AVRO_LINK

* AVRO-766: Correct memory leaks in AVRO_LINK copy

- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy

* Enable TEST_AVRO_1167 in test_avro_766

This ensures that both fixes work together and that no valgrind errors are 
produced from a recursive schema.


> Avro-C: avro_schema_copy() does not copy AVRO_LINKs properly.
> -
>
> Key: AVRO-1167
> URL: https://issues.apache.org/jira/browse/AVRO-1167
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.7.1
> Environment: Ubuntu Linux 11.10
>Reporter: Vivek Nadkarni
>Priority: Major
> Attachments: AVRO-1167-TEST.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When avro_schema_copy() encounters an AVRO_LINK from an old_schema to a 
> new_schema, it sets the target of the new_link to the target of the old_link 
> in the old_schema. Thus, the AVRO_LINK in the new_schema points to an element 
> in the old_schema. 
> While this is currently safe, since the reference count of the target in the 
> old_schema is incremented, we are not really making a "copy" of the schema.
> There is a "TODO" in the code, which says that we should make an 
> avro_schema_copy() of the target in old_schema instead of linking directly to 
> it. However, this solution of making a copy would result in a few problems:
> 1. Avro schemas are intended to be self-contained. That implies that 
> AVRO_LINKs are intended to be internal links inside a self-contained schema. 
> The code introduces unnecessary (and potentially disallowed) external 
> dependencies in an Avro schema. 
> 2. The purpose of copying a schema is to decouple the old_schema 
> from the new_schema. The two copies may have different owners, or we may want to 
> deallocate the old schema, etc.
> 3. If the schema is recursive, then the code would enter an infinite 
> recursion loop.
> It appears to me that the "correct" solution would be to replicate the entire 
> structure of the current schema, including the internal links. This means 
> that if old_link_A points to old_target_B, then new_link_A should point to 
> new_target_B in the new schema. Note that there should only be one copy of 
> new_target_B in the new schema, even if there are multiple links pointing to 
> new_target_B - i.e. we should not make a new copy for each link.
> In order to implement this proper copying of links, we would need to keep a 
> lookup table of pairs of old and new schemas as they are being created, as 
> well as a list of all the AVRO_LINKs that are copied. Then as a post-copy 
> step, we would go and fix up all the AVRO_LINKs to point to the appropriate 
> targets. This is the way the schema is constructed in the first place in 
> avro_schema_from_json().
> An inefficient way to obtain the correct result from avro_schema_copy() would 
> be to perform an avro_schema_to_json() followed by an avro_schema_from_json().
> Note: I have not implemented a fix for this issue, but I am documenting this 
> issue in AVRO-JIRA because this issue needs to be resolved before AVRO-766 
> can be fixed.
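
The link-preserving copy described above (a lookup table of old/new pairs plus a post-copy fix-up of the links) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Avro C implementation: `Node` and `copy_schema` are hypothetical names, and the sketch assumes every link target is reachable from the root through `children`, which is what a self-contained schema implies.

```python
class Node:
    """Hypothetical stand-in for a schema node; not the Avro C data structure."""

    def __init__(self, name, children=None, link_to=None):
        self.name = name
        self.children = children if children is not None else []
        self.link_to = link_to  # set only on link nodes


def copy_schema(root):
    """Copy a schema graph so that every link in the copy targets a copied node."""
    memo = {}   # id(old node) -> new node
    links = []  # (new link node, old target) pairs to fix up afterwards

    def copy_node(old):
        if id(old) in memo:
            return memo[id(old)]
        new = Node(old.name)
        memo[id(old)] = new
        if old.link_to is not None:
            links.append((new, old.link_to))
        # Links are not followed here, so a recursive schema cannot loop forever.
        new.children = [copy_node(child) for child in old.children]
        return new

    new_root = copy_node(root)
    # Post-copy step: point each copied link at the copy of its old target.
    for new_link, old_target in links:
        new_link.link_to = memo[id(old_target)]
    return new_root


# A recursive schema: the record contains a link back to itself.
tree = Node("LinkedList")
tree.children.append(Node("next", link_to=tree))
clone = copy_schema(tree)
```

After the copy, `clone.children[0].link_to` refers to `clone` rather than to `tree`, so the two schemas share no nodes and either one can be released independently.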



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-1967) NPE calling getXyzBuilder on instance where the xyz is null

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717785#comment-16717785
 ] 

ASF subversion and git services commented on AVRO-1967:
---

Commit e39170ef83fb06d09b2ad12afc9c6a1b429dfdf9 in avro's branch 
refs/heads/branch-1.8 from [~nielsbasjes]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=e39170e ]

AVRO-1967: Java: Fix NPE when calling getXyzBuilder on instance where the xyz 
is null


> NPE calling getXyzBuilder on instance where the xyz is null
> ---
>
> Key: AVRO-1967
> URL: https://issues.apache.org/jira/browse/AVRO-1967
> Project: Apache Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Critical
> Fix For: 1.9.0, 1.8.3
>
>
> Assume a Record with a nested nullable record that has been set to the value 
> 'null'.
> Then call the getXxxBuilder method on that record to obtain a builder for 
> that nested record and you get an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-2109) Reset buffers in case of IOException

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717787#comment-16717787
 ] 

ASF subversion and git services commented on AVRO-2109:
---

Commit a731fab500606404ecfd755717b441109ccf7337 in avro's branch 
refs/heads/branch-1.8 from [~gszadovszky]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=a731fab ]

AVRO-2109: Reset buffers in case of IOException

Closes #260

Signed-off-by: Zoltan Ivanfi 
Signed-off-by: sacharya 
Signed-off-by: Nandor Kollar 
(cherry picked from commit 673261c8656124cc58bee65fe5e8c779350779ee)


> Reset buffers in case of IOException
> 
>
> Key: AVRO-2109
> URL: https://issues.apache.org/jira/browse/AVRO-2109
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Gabor Szadovszky
>Assignee: Gabor Szadovszky
>Priority: Major
> Fix For: 1.7.8, 1.9.0, 1.8.3
>
>
> If an {{IOException}} is thrown from 
> {{DataFileWriter.writeBlock}}, the {{buffer}} and {{blockCount}} are not reset, 
> so duplicated data is written out on {{close}}/{{flush}}.
> This is actually a conceptual question: should we reset the buffer or 
> not in case of an exception? If an exception occurs while writing 
> the file, we should expect the file to be corrupt, so the possible 
> duplication of data should not matter.
> On the other hand, if the file is already corrupt, why would we try to write 
> anything again at file close?
> This issue comes from a Flume issue where the HDFS wait thread is interrupted 
> because of a timeout during writing an Avro file. The actual block is 
> properly written already but because of the {{IOException}} caused by the 
> thread interrupt we invoke {{close()}} on the writer which writes the block 
> again with some other stuff (maybe duplicated sync marker) that makes the 
> file corrupt.
> [~busbey], [~nkollar], [~zi], any thoughts?
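
One way to read the proposal in the issue title is "always reset the buffer, even when the write fails". The sketch below illustrates that idea only; `BlockWriter` and its methods are hypothetical and are not the Avro `DataFileWriter` API, and the real fix may differ in detail.

```python
import io


class BlockWriter:
    """Hypothetical buffered block writer; not the Avro DataFileWriter API."""

    def __init__(self, out):
        self.out = out
        self.buffer = io.BytesIO()
        self.block_count = 0

    def append(self, record: bytes):
        self.buffer.write(record)
        self.block_count += 1

    def write_block(self):
        if self.block_count == 0:
            return
        try:
            self.out.write(self.buffer.getvalue())
        finally:
            # Reset even when the write fails, so close() cannot emit the block twice.
            self.buffer = io.BytesIO()
            self.block_count = 0

    def close(self):
        self.write_block()
```

The trade-off is that a block whose write really did fail is dropped rather than retried; in the Flume scenario described above, the block had already reached the file, so dropping it avoids the duplicate.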



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-766) C: Memory leak from reference count cycles

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717801#comment-16717801
 ] 

ASF subversion and git services commented on AVRO-766:
--

Commit 1f00b1a7a2f76921325ba073de3212a4a22524de in avro's branch 
refs/heads/branch-1.8 from John Gill
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=1f00b1a ]

AVRO-1167, AVRO-766: Fix c AVRO_LINK memory leaks (#217)

* AVRO-1167: Enhance avro_schema_copy for AVRO_LINK

- Add hash of named schemas found during copy
- Find saved named schema for copy of AVRO_LINK

* AVRO-766: Correct memory leaks in AVRO_LINK copy

- Adds test cases for AVRO-766 & AVRO-1167
- Corrects reference counting for avro_schema_copy

* Enable TEST_AVRO_1167 in test_avro_766

This ensures that both fixes work together and that no valgrind errors are 
produced from a recursive schema.


> C: Memory leak from reference count cycles
> --
>
> Key: AVRO-766
> URL: https://issues.apache.org/jira/browse/AVRO-766
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.5.0
>Reporter: Douglas Creager
>Priority: Major
> Attachments: AVRO-766.patch, ref-cycle.c
>
>
> If you parse a recursive Avro schema, you end up with a cycle in the 
> reference graph for the avro_schema_t objects that are created.  The 
> reference counting mechanism that we're using can't detect this, and so you 
> get a memory leak.
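
The same effect can be reproduced outside C. The sketch below is an analogy in CPython, not Avro code: `SchemaNode` is a hypothetical stand-in for a reference-counted schema object, and the cycle collector is switched off temporarily so that only reference counting is active.

```python
import gc
import weakref


class SchemaNode:
    """Hypothetical stand-in for a reference-counted schema object."""

    def __init__(self):
        self.target = None


def cycle_survives_refcounting():
    """Return True if a self-referencing node outlives its last external
    reference under reference counting alone and is reclaimed only by the
    cycle collector."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # leave only reference counting active
    try:
        node = SchemaNode()
        node.target = node  # recursive schema: the node refers to itself
        probe = weakref.ref(node)
        del node  # drop the only external reference
        leaked = probe() is not None
        gc.collect()  # an explicit cycle collection reclaims the node
        return leaked and probe() is None
    finally:
        if gc_was_enabled:
            gc.enable()
```

Plain reference counting, as used by the C library, has no equivalent of the `gc.collect()` step, which is why the cycle has to be avoided or broken explicitly.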



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-1891) Generated Java code fails with union containing logical type

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1671#comment-1671
 ] 

ASF GitHub Bot commented on AVRO-1891:
--

dkulp commented on issue #118: AVRO-1891: Fix specific nested logical types
URL: https://github.com/apache/avro/pull/118#issuecomment-446323747
 
 
   I merged #329, is this not needed now?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Generated Java code fails with union containing logical type
> 
>
> Key: AVRO-1891
> URL: https://issues.apache.org/jira/browse/AVRO-1891
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java, logical types
>Affects Versions: 1.8.1
>Reporter: Ross Black
>Priority: Blocker
> Fix For: 1.8.3
>
> Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, 
> AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch
>
>
> Example schema:
> {code}
> {
>   "type": "record",
>   "name": "RecordV1",
>   "namespace": "org.brasslock.event",
>   "fields": [
>     { "name": "first", "type": ["null", {"type": "long", "logicalType": "timestamp-millis"}]}
>   ]
> }
> {code}
> The avro compiler generates a field using the relevant joda class:
> {code}
> public org.joda.time.DateTime first
> {code}
> Running the following code to perform encoding:
> {code}
> final RecordV1 record = new RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z"));
> final DatumWriter datumWriter = new SpecificDatumWriter<>(record.getSchema());
> final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192);
> final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
> datumWriter.write(record, encoder);
> encoder.flush();
> final byte[] bytes = stream.toByteArray();
> {code}
> fails with the exception stacktrace:
> {code}
>  org.apache.avro.AvroRuntimeException: Unknown datum type 
> org.joda.time.DateTime: 2016-07-29T10:15:30.000Z
> at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741)
> at 
> org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293)
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706)
> at 
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
> at 
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
> at 
> org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87)
> at 
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> at 
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> at 
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at 
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> at 
> org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at 
> 

[jira] [Commented] (AVRO-1891) Generated Java code fails with union containing logical type

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717769#comment-16717769
 ] 

ASF GitHub Bot commented on AVRO-1891:
--

dkulp closed pull request #329: Improved conversions handling + pluggable 
conversions support [AVRO-1891, AVRO-2065]
URL: https://github.com/apache/avro/pull/329
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/data/RecordBuilderBase.java 
b/lang/java/avro/src/main/java/org/apache/avro/data/RecordBuilderBase.java
index 106c500b4..6d2f4c19e 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/data/RecordBuilderBase.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/data/RecordBuilderBase.java
@@ -17,19 +17,16 @@
  */
 package org.apache.avro.data;
 
-import java.io.IOException;
-import java.util.Arrays;
-
 import org.apache.avro.AvroRuntimeException;
-import org.apache.avro.Conversion;
-import org.apache.avro.Conversions;
-import org.apache.avro.LogicalType;
 import org.apache.avro.Schema;
 import org.apache.avro.Schema.Field;
 import org.apache.avro.Schema.Type;
 import org.apache.avro.generic.GenericData;
 import org.apache.avro.generic.IndexedRecord;
 
+import java.io.IOException;
+import java.util.Arrays;
+
 /** Abstract base class for RecordBuilder implementations.  Not thread-safe. */
 public abstract class RecordBuilderBase
   implements RecordBuilder {
@@ -138,29 +135,6 @@ protected Object defaultValue(Field field) throws 
IOException {
 return data.deepCopy(field.schema(), data.getDefaultValue(field));
   }
 
-  /**
-   * Gets the default value of the given field, if any. Pass in a conversion
-   * to convert data to logical type class. Please make sure the schema does
-   * have a logical type, otherwise an exception would be thrown out.
-   * @param field the field whose default value should be retrieved.
-   * @param conversion the tool to convert data to logical type class
-   * @return the default value associated with the given field,
-   * or null if none is specified in the schema.
-   * @throws IOException
-   */
-  @SuppressWarnings({ "rawtypes", "unchecked" })
-  protected Object defaultValue(Field field, Conversion conversion) throws 
IOException {
-Schema schema = field.schema();
-LogicalType logicalType = schema.getLogicalType();
-Object rawDefaultValue = data.deepCopy(schema, 
data.getDefaultValue(field));
-if (conversion == null || logicalType == null) {
-  return rawDefaultValue;
-} else {
-  return Conversions.convertToLogicalType(rawDefaultValue, schema,
-  logicalType, conversion);
-}
-  }
-
   @Override
   public int hashCode() {
 final int prime = 31;
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java 
b/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
index 6dffa15c5..7294192f3 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
@@ -105,6 +105,10 @@ public GenericData(ClassLoader classLoader) {
   private Map, Map>> conversionsByClass =
   new IdentityHashMap<>();
 
+  public Collection> getConversions() {
+return conversions.values();
+  }
+
   /**
* Registers the given conversion to be used when reading and writing with
* this data model.
diff --git 
a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java 
b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
index d7b2bf825..c4388caaa 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java
@@ -17,6 +17,7 @@
  */
 package org.apache.avro.specific;
 
+import java.lang.reflect.Field;
 import java.util.Arrays;
 import java.util.HashSet;
 import java.util.Map;
@@ -132,6 +133,55 @@ public DatumWriter createDatumWriter(Schema schema) {
   /** Return the singleton instance. */
   public static SpecificData get() { return INSTANCE; }
 
+  /**
+   * For RECORD type schemas, this method returns the SpecificData instance of 
the class associated with the schema,
+   * in order to get the right conversions for any logical types used.
+   *
+   * @param reader the reader schema
+   * @return the SpecificData associated with the schema's class, or the 
default instance.
+   */
+  public static SpecificData getForSchema(Schema reader) {
+if (reader.getType() == Type.RECORD) {
+  final String className = getClassName(reader);
+  if (className != null) {
+final Class clazz;
+try {
+  clazz = Class.forName(className);

[jira] [Commented] (AVRO-2065) Avro java code generation for Unions doesn't set converters for unions

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717771#comment-16717771
 ] 

ASF subversion and git services commented on AVRO-2065:
---

Commit 7ed38d7c7e150987ef8bf035196576fc158e03eb in avro's branch 
refs/heads/master from Katrin Skoglund
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=7ed38d7 ]

Improved conversions handling + pluggable conversions support [AVRO-1891, 
AVRO-2065] (#329)

* Added end-to-end test that reproduces union with logical types problem

* Adding required conversions to SpecificData in generated class
(same as in SpecificCompiler)

* Added test with BigDecimal

* Added test with BigDecimal

* Introduced customizable conversions in compiler and Maven plugin.

* Fixed bug

* Fixed Maven plugin classpath

* Get the correct SpecificData whenever possible, to get the right conversions

* No need to expose the map of conversions so expose only the values.

* Better tests

* Default values and conversions

* Cleanup of some changes in Maven plugin

* Fixed equals() for classes with nested logical types. Improved tests

* Added missing copyright statement

* Fixed compile error after rebase

* Fixed problem with logical types in nested records.

* Fixed failing test.

* Fixed serialization problem when creating SpecificDatumReader from a class


> Avro java code generation for Unions doesn't set converters for unions
> --
>
> Key: AVRO-2065
> URL: https://issues.apache.org/jira/browse/AVRO-2065
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Stephane Maarek
>Priority: Blocker
> Fix For: 1.8.3
>
>
> The full issue is here: 
> https://stackoverflow.com/questions/45581437/how-to-specify-converter-for-default-value-in-avro-union-logical-type-fields
> {code}
> {
>   "name": "my_optional_date",
>   "type": [
>     {
>       "type": "long",
>       "logicalType": "timestamp-millis"
>     },
>     "null"
>   ],
>   "default": 1502250227187
> }
> {code}
> Doesn't generate the right conversions
> {code}
> private static final org.apache.avro.Conversion[] conversions =
>     new org.apache.avro.Conversion[] {
>       null,  // <-- THIS ONE IS NOT SET PROPERLY, should be TIMESTAMP_CONVERSION
>       null
>     };
> {code}
> The code fails on 
> {code}
> BuggyRecord.Builder buggyRecordBuilder = BuggyRecord.newBuilder();
> buggyRecordBuilder.build();
> {code}
> with 
> {code}
> org.apache.avro.AvroRuntimeException: java.lang.ClassCastException: 
> java.lang.Long cannot be cast to org.joda.time.DateTime
> at com.example.BuggyRecord$Builder.build(BuggyRecord.java:301)
> at BuggyRecordTest.Foo(BuggyRecordTest.java:10)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> org.joda.time.DateTime
> at 

[jira] [Resolved] (AVRO-1777) Select best matching record when writing a union in python

2018-12-11 Thread Daniel Kulp (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kulp resolved AVRO-1777.
---
   Resolution: Fixed
 Assignee: Daniel Kulp
Fix Version/s: 1.9.0

> Select best matching record when writing a union in python
> --
>
> Key: AVRO-1777
> URL: https://issues.apache.org/jira/browse/AVRO-1777
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Affects Versions: 1.7.7
>Reporter: Steven Aerts
>Assignee: Daniel Kulp
>Priority: Major
> Fix For: 1.9.0
>
>
> Unlike javascript, python is not using wrapped types.
> So when writing a union it needs to find out which type it will output.
> At the moment it takes the last validating type.
> I propose to take the type with the most matching fields.
> So I propose to change in {{io.py}}:
> {code}
> # resolve union
> index_of_schema = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     index_of_schema = i
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> into
> {code}
> # resolve union
> index_of_schema = -1
> found_fields = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     nr_fields = candidate_schema.type in ['record', 'error', 'request'] and len(candidate_schema.fields) or 1
>     if nr_fields > found_fields:
>       index_of_schema = i
>       found_fields = nr_fields
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> If you want, I can create a pull request for this and apply it to both py3 
> and py.
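
The proposed selection rule can be tried without the Avro library. The sketch below is self-contained and rests on assumptions: schemas are plain dictionaries, and `validate` is a tiny stand-in that only checks Python types, unlike the real `validate` in `io.py`. Only the "most matching fields wins" loop follows the proposal.

```python
def validate(schema, datum):
    """Tiny stand-in validator; the real avro `validate` covers far more cases."""
    if schema["type"] == "record":
        return isinstance(datum, dict) and all(
            isinstance(datum.get(name), types)
            for name, types in schema["fields"].items()
        )
    return isinstance(datum, schema["python_type"])


def resolve_union(schemas, datum):
    """Return the index of the validating branch with the most fields."""
    index_of_schema = -1
    found_fields = -1
    for i, candidate in enumerate(schemas):
        if validate(candidate, datum):
            # Non-record branches (and empty records) count as one field.
            nr_fields = (len(candidate["fields"]) or 1) if candidate["type"] == "record" else 1
            if nr_fields > found_fields:
                index_of_schema = i
                found_fields = nr_fields
    if index_of_schema < 0:
        raise TypeError("datum matches no branch of the union")
    return index_of_schema


# Hypothetical union: a two-field record followed by a one-field record.
wide = {"type": "record", "fields": {"a": int, "b": int}}
narrow = {"type": "record", "fields": {"a": int}}
union = [wide, narrow]
```

With `{"a": 1, "b": 2}` both branches validate; the "last validating type" rule would pick `narrow` (index 1), while the proposed rule picks `wide` (index 0).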



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-2075) Allow SchemaCompatibility to report possibly lossy conversions

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717721#comment-16717721
 ] 

ASF GitHub Bot commented on AVRO-2075:
--

epkanol commented on issue #246: AVRO-2075: Add option to report possible data 
loss in SchemaCompatibi…
URL: https://github.com/apache/avro/pull/246#issuecomment-446309912
 
 
   Ping @dkulp , thanks for pointing this out - I had forgotten about this PR, 
I will take a look as soon as time permits (most likely this week, or next week 
at the latest). It has been some time since I last contributed to Avro, though, 
so there will be some initial set-up (e.g. formatting issues - I have a new 
environment now...)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow SchemaCompatibility to report possibly lossy conversions
> --
>
> Key: AVRO-2075
> URL: https://issues.apache.org/jira/browse/AVRO-2075
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.7.7, 1.8.2
> Environment: Java
>Reporter: Anders Sundelin
>Assignee: Anders Sundelin
>Priority: Minor
> Attachments: 
> 0001-AVRO-2075-Add-option-to-report-possible-data-loss-in.patch
>
>
> It is stated in the Avro spec that int and long values are promotable to 
> floats and doubles.
> However, promoting an int or long to float can lose precision, as can 
> promoting a long to double.
> It is suggested that the SchemaCompatibility class be updated so that it can 
> flag conversions that may be lossy as errors. The 
> attached patch does just that, by adding a new boolean flag (allowDataLoss), 
> preserving backwards compatibility by defaulting this flag to true.
> Test cases illustrating the problem have been added to the unit test class 
> TestReadingWritingDataInEvolvedSchemas
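
The precision loss described above is easy to reproduce outside Avro. The following is a small sketch under stated assumptions: it uses only Python's standard library, treats Avro `float` as a 32-bit IEEE 754 value and Avro `double` as a 64-bit one, and the helper names (`to_float32`, `is_lossy_as_float`, `is_lossy_as_double`) are hypothetical. It does not use or model the SchemaCompatibility API.

```python
import struct


def to_float32(value):
    """Round-trip a number through a 32-bit IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def is_lossy_as_float(value):
    """True if an integer changes when stored as a 32-bit float."""
    return int(to_float32(value)) != value


def is_lossy_as_double(value):
    """True if an integer changes when stored as a 64-bit double."""
    return int(float(value)) != value


# A 32-bit float has a 24-bit significand, a 64-bit double a 53-bit one.
print(is_lossy_as_float(2**24))       # exactly representable
print(is_lossy_as_float(2**24 + 1))   # an int that a float cannot hold exactly
print(is_lossy_as_double(2**31 - 1))  # every 32-bit int fits in a double
print(is_lossy_as_double(2**53 + 1))  # a long that a double cannot hold exactly
```

Only int to double is exact for every value; the other three promotions can silently round.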





[jira] [Commented] (AVRO-1777) Select best matching record when writing a union in python

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717713#comment-16717713
 ] 

ASF GitHub Bot commented on AVRO-1777:
--

dkulp closed pull request #95: AVRO-1777: Select best matching record when 
writing a union in python
URL: https://github.com/apache/avro/pull/95
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/lang/py/src/avro/io.py b/lang/py/src/avro/io.py
index b2fd2f9ba..63907e17c 100644
--- a/lang/py/src/avro/io.py
+++ b/lang/py/src/avro/io.py
@@ -94,6 +94,10 @@ def __init__(self, fail_msg, writers_schema=None, readers_schema=None):
     if readers_schema: fail_msg += "\nReader's Schema: %s" % pretty_readers
     schema.AvroException.__init__(self, fail_msg)
 
+class RecordInitializationException(schema.AvroException):
+    def __init__(self, fail_msg):
+        schema.AvroException.__init__(self, fail_msg)
+
 #
 # Validate
 #
@@ -110,14 +114,17 @@ def validate(expected_schema, datum):
   elif schema_type == 'bytes':
     return isinstance(datum, str)
   elif schema_type == 'int':
-    return ((isinstance(datum, int) or isinstance(datum, long))
-            and INT_MIN_VALUE <= datum <= INT_MAX_VALUE)
+    return (((isinstance(datum, int) and not isinstance(datum, bool)) or
+            isinstance(datum, long)) and
+            INT_MIN_VALUE <= datum <= INT_MAX_VALUE)
   elif schema_type == 'long':
-    return ((isinstance(datum, int) or isinstance(datum, long))
-            and LONG_MIN_VALUE <= datum <= LONG_MAX_VALUE)
+    return (((isinstance(datum, int) and not isinstance(datum, bool)) or
+            isinstance(datum, long)) and
+            LONG_MIN_VALUE <= datum <= LONG_MAX_VALUE)
   elif schema_type in ['float', 'double']:
-    return (isinstance(datum, int) or isinstance(datum, long)
-            or isinstance(datum, float))
+    return (isinstance(datum, long) or
+            (isinstance(datum, int) and not isinstance(datum, bool)) or
+            isinstance(datum, float))
   elif schema_type == 'fixed':
     return isinstance(datum, str) and len(datum) == expected_schema.size
   elif schema_type == 'enum':
@@ -132,6 +139,8 @@ def validate(expected_schema, datum):
         [validate(expected_schema.values, v) for v in datum.values()])
   elif schema_type in ['union', 'error_union']:
     return True in [validate(s, datum) for s in expected_schema.schemas]
+  elif schema_type == 'record' and isinstance(datum, GenericRecord):
+      return expected_schema == datum.schema
   elif schema_type in ['record', 'error', 'request']:
     return (isinstance(datum, dict) and
       False not in
@@ -683,7 +692,7 @@ def read_record(self, writers_schema, readers_schema, decoder):
     """
     # schema resolution
     readers_fields_dict = readers_schema.fields_dict
-    read_record = {}
+    read_record = GenericRecord(readers_schema)
     for field in writers_schema.fields:
       readers_field = readers_fields_dict.get(field.name)
       if readers_field is not None:
@@ -888,3 +897,23 @@ def write_record(self, writers_schema, datum, encoder):
     """
     for field in writers_schema.fields:
       self.write_data(field.type, datum.get(field.name), encoder)
+
+class GenericRecord(dict):
+
+    def __init__(self, record_schema, lst = []):
+        if (record_schema is None or
+            not isinstance(record_schema, schema.Schema)):
+            raise RecordInitializationException(
+                "Cannot initialize a record with schema: {sc}".format(sc = record_schema))
+        dict.__init__(self, lst)
+        self.schema = record_schema
+
+    def __eq__(self, other):
+        if other is None or not isinstance(other, dict):
+            return False
+        if not dict.__eq__(self, other):
+            return False
+        if isinstance(other, GenericRecord):
+            return self.schema == other.schema
+        else:
+            return True
diff --git a/lang/py/test/test_io.py b/lang/py/test/test_io.py
index 1e79d3e89..d6e341a47 100644
--- a/lang/py/test/test_io.py
+++ b/lang/py/test/test_io.py
@@ -39,6 +39,8 @@
   ('{"type": "array", "items": "long"}', [1, 3, 2]),
   ('{"type": "map", "values": "long"}', {'a': 1, 'b': 3, 'c': 2}),
   ('["string", "null", "long"]', None),
+  ('["double", "boolean"]', True),
+  ('["boolean", "double"]', True),
   ("""\
{"type": "record",
 "name": "Test",
@@ -190,6 +192,13 @@ def test_validate(self):
   def test_round_trip(self):
     print_test_name('TEST ROUND TRIP')
     correct = 0
+    def are_equal(datum, round_trip_datum):
+        if datum != round_trip_datum:
+            return False
+        if type(datum) == bool:
+            return type(round_trip_datum) == bool
+

[jira] [Commented] (AVRO-1777) Select best matching record when writing a union in python

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717714#comment-16717714
 ] 

ASF subversion and git services commented on AVRO-1777:
---

Commit beef86697a0dbd09d4d99a735f8a5afe37e5d976 in avro's branch 
refs/heads/master from [~shiraeeshi]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=beef866 ]

AVRO-1777: Select best matching record when writing a union in python (#95)

* make numeric schemas not valid for boolean datums

* fix formatting

* fix int and long schemas

* fix brackets

* fix brackets

* add GenericRecord type
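
The first bullet in the commit message above ("make numeric schemas not valid for boolean datums") rests on a Python detail: `bool` is a subclass of `int`, so `isinstance(True, int)` is true and a boolean datum would otherwise validate against numeric branches of a union. A minimal sketch of the idea follows, written for Python 3 (so there is no `long`); the helper names `is_avro_int`, `is_avro_double`, `is_avro_boolean` and `pick_branch` are hypothetical and are not part of `avro.io`.

```python
INT_MIN_VALUE = -(1 << 31)
INT_MAX_VALUE = (1 << 31) - 1


def is_avro_int(datum):
    """Accept Python ints in 32-bit range, but reject bool (a subclass of int)."""
    return (isinstance(datum, int) and not isinstance(datum, bool)
            and INT_MIN_VALUE <= datum <= INT_MAX_VALUE)


def is_avro_double(datum):
    """Accept ints and floats, again rejecting bool."""
    return isinstance(datum, (int, float)) and not isinstance(datum, bool)


def is_avro_boolean(datum):
    """Accept only real booleans."""
    return isinstance(datum, bool)


def pick_branch(checks, datum):
    """Return the name of the first branch whose check accepts the datum."""
    for name, check in checks:
        if check(datum):
            return name
    raise TypeError("no branch accepts %r" % (datum,))


# Mirrors the '["double", "boolean"]' test case: True must not be taken for a double.
checks = [("double", is_avro_double), ("boolean", is_avro_boolean)]
print(isinstance(True, int))
print(pick_branch(checks, True))
```

Without the `not isinstance(datum, bool)` guard, `True` would match the `double` branch first and would not come back as a boolean after a round trip.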


> Select best matching record when writing a union in python
> --
>
> Key: AVRO-1777
> URL: https://issues.apache.org/jira/browse/AVRO-1777
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Affects Versions: 1.7.7
>Reporter: Steven Aerts
>Priority: Major
> Fix For: 1.9.0
>
>
> Unlike JavaScript, Python does not use wrapped types.
> So when writing a union, it has to guess which type it will output.
> At the moment it takes the last validating type.
> I propose to take the type with the most matching fields.
> So I propose to change this code in {{io.py}}:
> {code}
> # resolve union
> index_of_schema = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     index_of_schema = i
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> into
> {code}
> # resolve union
> index_of_schema = -1
> found_fields = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     nr_fields = candidate_schema.type in ['record', 'error', 'request'] and len(candidate_schema.fields) or 1
>     if nr_fields > found_fields:
>       index_of_schema = i
>       found_fields = nr_fields
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> If you want, I can create a pull request for this and apply it to both py3 
> and py.





[jira] [Resolved] (AVRO-2034) Nested schema types with unexpected fields causes json parse failure

2018-12-11 Thread Daniel Kulp (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kulp resolved AVRO-2034.
---
   Resolution: Fixed
 Assignee: Daniel Kulp
Fix Version/s: 1.9.0

> Nested schema types with unexpected fields causes json parse failure
> 
>
> Key: AVRO-2034
> URL: https://issues.apache.org/jira/browse/AVRO-2034
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Todd Nine
>Assignee: Daniel Kulp
>Priority: Major
> Fix For: 1.9.0
>
>
> Parsing a nested type that contains an unexpected field with the JSON parser 
> results in an error.  To reproduce, see the class {{TestNestedRecords}} 
> in the referenced PR.
> https://github.com/apache/avro/pull/224
> Note that this only occurs when the following pattern exists in the schema.
> # regular field
> # nested record with additional field
> # Any subsequent field following the nested record with an unexpected field 
> appears to reproduce the problem.





[jira] [Commented] (AVRO-2034) Nested schema types with unexpected fields causes json parse failure

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717694#comment-16717694
 ] 

ASF subversion and git services commented on AVRO-2034:
---

Commit 254ee8ff595c6c52580128aec9355394f96382d5 in avro's branch 
refs/heads/master from [~dkulp]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=254ee8f ]

[AVRO-2034] Remove conditions that will always be true/false


> Nested schema types with unexpected fields causes json parse failure
> 
>
> Key: AVRO-2034
> URL: https://issues.apache.org/jira/browse/AVRO-2034
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Todd Nine
>Priority: Major
> Fix For: 1.9.0
>
>
> Parsing a nested type that contains an unexpected field with the JSON parser 
> results in an error.  To reproduce, see the class {{TestNestedRecords}} 
> in the referenced PR.
> https://github.com/apache/avro/pull/224
> Note that this only occurs when the following pattern exists in the schema.
> # regular field
> # nested record with additional field
> # Any subsequent field following the nested record with an unexpected field 
> appears to reproduce the problem.





[jira] [Commented] (AVRO-2034) Nested schema types with unexpected fields causes json parse failure

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717685#comment-16717685
 ] 

ASF subversion and git services commented on AVRO-2034:
---

Commit d55f5e152c288a2037d65d15a7169d76aa9be2be in avro's branch 
refs/heads/master from [~tnine]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=d55f5e1 ]

AVRO-2034 Nested schema types with unexpected fields causes json parse failure 
(#224)

* AVRO-2034 Created test to prove issue

* AVRO-2034. Updates test to show a working record vs a failing record in the 
simplest possible scheme.

* AVRO-2034 Fixes advance logic to skip unrecognized fields at record end


> Nested schema types with unexpected fields causes json parse failure
> 
>
> Key: AVRO-2034
> URL: https://issues.apache.org/jira/browse/AVRO-2034
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Todd Nine
>Priority: Major
>
> Parsing a nested type that contains an unexpected field with the JSON parser 
> results in an error.  To reproduce, see the class {{TestNestedRecords}} 
> in the referenced PR.
> https://github.com/apache/avro/pull/224
> Note that this only occurs when the following pattern exists in the schema.
> # regular field
> # nested record with additional field
> # Any subsequent field following the nested record with an unexpected field 
> appears to reproduce the problem.





[jira] [Commented] (AVRO-2034) Nested schema types with unexpected fields causes json parse failure

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717681#comment-16717681
 ] 

ASF GitHub Bot commented on AVRO-2034:
--

dkulp closed pull request #224: AVRO-2034 Nested schema types with unexpected 
fields causes json parse failure
URL: https://github.com/apache/avro/pull/224
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java b/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
index 78fafaa83..cd2742453 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
@@ -494,14 +494,23 @@ public Symbol doAction(Symbol input, Symbol top) throws IOException {
         throw error("record-start");
       }
     } else if (top == Symbol.RECORD_END || top == Symbol.UNION_END) {
-      if (in.getCurrentToken() == JsonToken.END_OBJECT) {
+      //AVRO-2034 advance to the end of our object
+      while(in.getCurrentToken() != JsonToken.END_OBJECT){
         in.nextToken();
+      }
+
+      if (in.getCurrentToken() == JsonToken.END_OBJECT) {
+
         if (top == Symbol.RECORD_END) {
           if (currentReorderBuffer != null && !currentReorderBuffer.savedFields.isEmpty()) {
             throw error("Unknown fields: " + currentReorderBuffer.savedFields.keySet());
           }
           currentReorderBuffer = reorderBuffers.pop();
         }
+
+        //AVRO-2034 advance beyond the end object for the next record.
+        in.nextToken();
+
       } else {
         throw error(top == Symbol.RECORD_END ? "record-end" : "union-end");
       }
diff --git 
a/lang/java/avro/src/test/java/org/apache/avro/TestNestedRecords.java 
b/lang/java/avro/src/test/java/org/apache/avro/TestNestedRecords.java
new file mode 100644
index 0..8900b1ee9
--- /dev/null
+++ b/lang/java/avro/src/test/java/org/apache/avro/TestNestedRecords.java
@@ -0,0 +1,110 @@
+package org.apache.avro;
+
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.DecoderFactory;
+import org.apache.avro.io.JsonDecoder;
+import org.junit.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+
+import static org.hamcrest.CoreMatchers.equalTo;
+import static org.junit.Assert.assertThat;
+
+/**
+ * This test demonstrates the fix for a complex nested schema type.
+ */
+public class TestNestedRecords {
+
+
+  @Test
+  public void testSingleSubRecord() throws IOException {
+
+final Schema child = SchemaBuilder.record("Child")
+.namespace("org.apache.avro.nested")
+.fields()
+.requiredString("childField").endRecord();
+
+
+final Schema parent = SchemaBuilder.record("Parent")
+.namespace("org.apache.avro.nested")
+.fields()
+.requiredString("parentField1")
+.name("child1").type(child).noDefault()
+.requiredString("parentField2").endRecord();
+
+
+
+final String inputAsExpected = "{\n" +
+" \"parentField1\": \"parentValue1\",\n" +
+" \"child1\":{\n" +
+"\"childField\":\"childValue1\"\n" +
+" },\n" +
+" \"parentField2\":\"parentValue2\"\n" +
+"}";
+
+
+final ByteArrayInputStream inputStream = new 
ByteArrayInputStream(inputAsExpected.getBytes());
+
+final JsonDecoder decoder = DecoderFactory.get().jsonDecoder(parent, 
inputStream);
+final DatumReader reader = new GenericDatumReader(parent);
+
+final GenericData.Record  decoded = (GenericData.Record) reader.read(null, 
decoder);
+
+
+assertThat(decoded.get("parentField1").toString(), 
equalTo("parentValue1"));
+assertThat(decoded.get("parentField2").toString(), 
equalTo("parentValue2"));
+
+
assertThat(((GenericData.Record)decoded.get("child1")).get("childField").toString(),
 equalTo("childValue1"));
+
+  }
+
+
+
+  @Test
+  public void testSingleSubRecordExtraField() throws IOException {
+
+final Schema child = SchemaBuilder.record("Child")
+.namespace("org.apache.avro.nested")
+.fields()
+.requiredString("childField").endRecord();
+
+
+final Schema parent = SchemaBuilder.record("Parent")
+.namespace("org.apache.avro.nested")
+.fields()
+.requiredString("parentField1")
+.name("child1").type(child).noDefault()
+.requiredString("parentField2").endRecord();
+
+
+final String inputAsExpected = "{\n" +
+" \"parentField1\": \"parentValue1\",\n" +
+  

[jira] [Commented] (AVRO-2184) Unable to decode JSON data file if a property is renamed in reader schema

2018-12-11 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717679#comment-16717679
 ] 

ASF subversion and git services commented on AVRO-2184:
---

Commit 595643cba16a2b4d7be68d46ee8f79c4e380cbf7 in avro's branch 
refs/heads/master from nandorKollar
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=595643c ]

AVRO-2184: Unable to decode JSON data file if a property is renamed in reader 
schema (#316)

* AVRO-2184: Unable to decode JSON data file if a property is renamed in reader 
schema

JsonDecoder doesn't honor aliases

* No need to wrap aliases to unmodifiableSet, since getter Schema#aliases 
already does it

* Remove unused import to pass Checkstyle check
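
The idea behind "JsonDecoder doesn't honor aliases" can be sketched independently of the Java code: when looking for a reader field in the JSON object, a key matches if it equals the field's name or any of its aliases. The snippet below is only an illustration under that assumption; `read_field` is a hypothetical helper, not an Avro API, and the field names are taken from the schemas quoted in the issue.

```python
import json


def read_field(json_object, name, aliases=()):
    """Return the value stored under the reader field's name or one of its aliases."""
    for key in (name, *aliases):
        if key in json_object:
            return json_object[key]
    raise KeyError("missing required field %s" % name)


# Data written with the writer schema uses "name"; the reader schema calls
# the same field "fname" and lists "name" as an alias.
written = json.loads('{"name": "Alyssa"}')
print(read_field(written, "fname", aliases=["name"]))
```

Calling `read_field(written, "fname")` without the alias raises `KeyError`, which corresponds to the "missing required field fname" failure reported in the issue.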


> Unable to decode JSON data file if a property is renamed in reader schema
> -
>
> Key: AVRO-2184
> URL: https://issues.apache.org/jira/browse/AVRO-2184
> Project: Apache Avro
>  Issue Type: Bug
>Reporter: Prateek Kohli
>Assignee: Nandor Kollar
>Priority: Major
> Attachments: TestAliasesInSchemaEvolution.java
>
>
> I am unable to decode a JSON data file if a property is renamed in the reader 
> schema, even though, per the documentation, this is a compatible change.
> Datatype promotion is not supported either: if I change the 
> datatype of the favorite_number field in the writer's schema, decoding fails.
> Both scenarios work if I use binary decoding instead of 
> JSON.
> *Writer Schema :*
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "User",
>  "fields": [
>      {"name": "name", "type": "string"},
>      {"name": "favorite_number", "type": ["int", "null"]},
>      {"name": "favorite_color", "type": ["string", "null"]}
>  ]}
>
> *Reader Schema :*
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "User",
>  "fields": [
>      {"name": "fname", "type": "string", "aliases" : [ "name" ]},
>      {"name": "favorite_number", "type": ["int", "null"]},
>      {"name": "favorite_color", "type": ["string", "null"]}
>  ]}
>
> *I have written the below code to decode JSON data:*
> FileInputStream fin = new FileInputStream(file);
> byte fileContent[] = new byte[(int)file.length()];
> fin.read(fileContent);
> InputStream input = new ByteArrayInputStream(fileContent);
> DataInputStream din = new DataInputStream(input);
>
> while (true) {
>     try {
>         Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
>         ResolvingDecoder resolvingDecoder = DecoderFactory.get().resolvingDecoder(writer, reader, decoder);
>         Object datum = datumReader.read(null, resolvingDecoder);
>         System.out.println(datum);
>     } catch (EOFException eofException) {
>         break;
>     }
> }
> *Below is the Exception I get :*
> Exception in thread "main" org.apache.avro.AvroTypeException: Found example.avro.User, expecting example.avro.User, missing required field fname
>  at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
>  at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>  at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:196)
>  at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201)
>  at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:422)
>  at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:414)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:181)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>  at com.ericsson.avroTest.avroCheck.WithoutCodeTest.main(WithoutCodeTest.java:134)





[jira] [Commented] (AVRO-2184) Unable to decode JSON data file if a property is renamed in reader schema

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717676#comment-16717676
 ] 

ASF GitHub Bot commented on AVRO-2184:
--

dkulp closed pull request #316: AVRO-2184: Unable to decode JSON data file if a 
property is renamed in reader schema
URL: https://github.com/apache/avro/pull/316
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java 
b/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
index 1dcb8dd15..21cea4402 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java
@@ -469,7 +469,7 @@ public Symbol doAction(Symbol input, Symbol top) throws IOException {
 do {
   String fn = in.getText();
   in.nextToken();
-  if (name.equals(fn)) {
+  if (name.equals(fn) || fa.aliases.contains(fn)) {
 return null;
   } else {
 if (currentReorderBuffer == null) {
diff --git a/lang/java/avro/src/main/java/org/apache/avro/io/parsing/JsonGrammarGenerator.java b/lang/java/avro/src/main/java/org/apache/avro/io/parsing/JsonGrammarGenerator.java
index 505c09423..44fc19b08 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/io/parsing/JsonGrammarGenerator.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/io/parsing/JsonGrammarGenerator.java
@@ -84,7 +84,7 @@ public Symbol generate(Schema sc, Map seen) {
 int n = 0;
 production[--i] = Symbol.RECORD_START;
 for (Field f : sc.getFields()) {
-  production[--i] = Symbol.fieldAdjustAction(n, f.name());
+  production[--i] = Symbol.fieldAdjustAction(n, f.name(), f.aliases());
   production[--i] = generate(f.schema(), seen);
   production[--i] = Symbol.FIELD_END;
   n++;
diff --git a/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Symbol.java b/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Symbol.java
index 187942400..df0ee4652 100644
--- a/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Symbol.java
+++ b/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Symbol.java
@@ -541,16 +541,18 @@ public SkipAction flatten(Map map,
 
   }
 
-  public static FieldAdjustAction fieldAdjustAction(int rindex, String fname) {
-return new FieldAdjustAction(rindex, fname);
+  public static FieldAdjustAction fieldAdjustAction(int rindex, String fname, Set aliases) {
+return new FieldAdjustAction(rindex, fname, aliases);
   }
 
   public static class FieldAdjustAction extends ImplicitAction {
 public final int rindex;
 public final String fname;
-@Deprecated public FieldAdjustAction(int rindex, String fname) {
+public final Set aliases;
+@Deprecated public FieldAdjustAction(int rindex, String fname, Set aliases) {
   this.rindex = rindex;
   this.fname = fname;
+  this.aliases = aliases;
 }
   }
 
diff --git a/lang/java/avro/src/test/java/org/apache/avro/TestReadingWritingDataInEvolvedSchemas.java b/lang/java/avro/src/test/java/org/apache/avro/TestReadingWritingDataInEvolvedSchemas.java
index 85a3ca705..c4ea7e741 100644
--- a/lang/java/avro/src/test/java/org/apache/avro/TestReadingWritingDataInEvolvedSchemas.java
+++ b/lang/java/avro/src/test/java/org/apache/avro/TestReadingWritingDataInEvolvedSchemas.java
@@ -21,9 +21,12 @@
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertNull;
 
+import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
 import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collection;
 
 import org.apache.avro.generic.GenericData;
 import org.apache.avro.generic.GenericData.EnumSymbol;
@@ -39,7 +42,11 @@
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
 
+@RunWith(Parameterized.class)
 public class TestReadingWritingDataInEvolvedSchemas {
 
   private static final String RECORD_A = "RecordA";
@@ -116,6 +123,23 @@
   
       .name(FIELD_A).type().unionOf().floatType().and().doubleType().endUnion().noDefault() //
   .endRecord();
 
+  @Parameters(name = "encoder = {0}")
+  public static Collection data() {
+return Arrays.asList(new EncoderType[][]{
+  {EncoderType.BINARY}, {EncoderType.JSON}
+});
+  }
+
+  public TestReadingWritingDataInEvolvedSchemas(EncoderType encoderType) {
+this.encoderType = encoderType;
+  }
+
+  private final EncoderType 
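
The core of the JsonDecoder change in the diff above is the added alias check: a JSON field name is accepted for a reader field if it equals the field's name or appears in the field's aliases. The following is a minimal, Avro-free sketch of that rule only. FieldAliasMatchDemo, FieldSpec, and matches are hypothetical names used for illustration; they are not part of Avro, and the real decoder does considerably more (reordering, buffering of skipped fields).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class FieldAliasMatchDemo {

    // Hypothetical stand-in for the (name, aliases) pair that the patch
    // stores on Symbol.FieldAdjustAction.
    static final class FieldSpec {
        final String name;
        final Set<String> aliases;

        FieldSpec(String name, Set<String> aliases) {
            this.name = name;
            this.aliases = aliases;
        }

        // Convenience factory so callers do not need to build a Set themselves.
        static FieldSpec of(String name, String... aliases) {
            Set<String> set = new HashSet<>(Arrays.asList(aliases));
            return new FieldSpec(name, Collections.unmodifiableSet(set));
        }
    }

    // Mirrors the patched condition: name.equals(fn) || fa.aliases.contains(fn)
    static boolean matches(FieldSpec field, String jsonFieldName) {
        return field.name.equals(jsonFieldName) || field.aliases.contains(jsonFieldName);
    }

    public static void main(String[] args) {
        // Reader schema renamed "oldName" to "newName" and kept the old name as an alias.
        FieldSpec renamed = FieldSpec.of("newName", "oldName");

        System.out.println(matches(renamed, "newName")); // true: direct name match
        System.out.println(matches(renamed, "oldName")); // true: matched through the alias
        System.out.println(matches(renamed, "other"));   // false: neither name nor alias
    }
}
```

Before the patch only the first comparison existed, so JSON data written under the old field name could not be matched to the renamed reader field.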

[jira] [Commented] (AVRO-2075) Allow SchemaCompatibility to report possibly lossy conversions

2018-12-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16717669#comment-16717669
 ] 

ASF GitHub Bot commented on AVRO-2075:
--

dkulp commented on issue #246: AVRO-2075: Add option to report possible data 
loss in SchemaCompatibi…
URL: https://github.com/apache/avro/pull/246#issuecomment-446301010
 
 
   Can this please be rebased on current master and updated? There are a 
bunch of conflicts that would need to be resolved.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow SchemaCompatibility to report possibly lossy conversions
> --
>
> Key: AVRO-2075
> URL: https://issues.apache.org/jira/browse/AVRO-2075
> Project: Apache Avro
>  Issue Type: Improvement
>Affects Versions: 1.7.7, 1.8.2
> Environment: Java
>Reporter: Anders Sundelin
>Assignee: Anders Sundelin
>Priority: Minor
> Attachments: 
> 0001-AVRO-2075-Add-option-to-report-possible-data-loss-in.patch
>
>
> It is stated in the Avro spec that int and long values are promotable to 
> floats and doubles.
> However, promoting an int or long to float can lose precision, as can 
> promoting a long to double.
> It is suggested that the SchemaCompatibility class be updated so that it can 
> flag potentially lossy conversions as errors. The attached patch does just 
> that by adding a new boolean flag (allowDataLoss), which defaults to true to 
> preserve backwards compatibility.
> Test cases illustrating the problem have been added to the unit test class 
> TestReadingWritingDataInEvolvedSchemas.
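
The precision loss described in the issue can be reproduced with plain Java, independent of Avro. The following is a minimal sketch under that assumption; LossyPromotionDemo and its methods are illustrative names and are not part of Avro or of the attached patch. Each check compares the exact decimal value of the promoted number with the original integer.

```java
import java.math.BigDecimal;

public class LossyPromotionDemo {

    // True if the int is exactly representable as a float.
    static boolean intFitsInFloat(int v) {
        float promoted = (float) v;
        // float -> double is exact, and new BigDecimal(double) is an exact conversion.
        return new BigDecimal((double) promoted).compareTo(BigDecimal.valueOf((long) v)) == 0;
    }

    // True if the long is exactly representable as a float.
    static boolean longFitsInFloat(long v) {
        float promoted = (float) v;
        return new BigDecimal((double) promoted).compareTo(BigDecimal.valueOf(v)) == 0;
    }

    // True if the long is exactly representable as a double.
    static boolean longFitsInDouble(long v) {
        double promoted = (double) v;
        return new BigDecimal(promoted).compareTo(BigDecimal.valueOf(v)) == 0;
    }

    public static void main(String[] args) {
        // A float has a 24-bit significand, so 2^24 + 1 is the first positive int that does not fit.
        System.out.println(intFitsInFloat(1 << 24));          // true
        System.out.println(intFitsInFloat((1 << 24) + 1));    // false

        System.out.println(longFitsInFloat((1L << 24) + 1));  // false

        // A double has a 53-bit significand, so 2^53 + 1 is the first positive long that does not fit.
        System.out.println(longFitsInDouble(1L << 53));       // true
        System.out.println(longFitsInDouble((1L << 53) + 1)); // false
    }
}
```

The comparison goes through BigDecimal rather than a cast back to int or long, because a round-trip cast saturates at the type's maximum and would report values such as Integer.MAX_VALUE as lossless.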


