[jira] [Commented] (AVRO-1705) Set up Jenkins job to test all languages using Docker

2016-01-20 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108811#comment-15108811
 ] 

Sean Busbey commented on AVRO-1705:
---

note that the current nightly CI builds are done on buildbot:  
http://ci.apache.org/

if/when we move to jenkins we should decommission the buildbot jobs.

> Set up Jenkins job to test all languages using Docker
> -
>
> Key: AVRO-1705
> URL: https://issues.apache.org/jira/browse/AVRO-1705
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.7.7
>Reporter: Tom White
>  Labels: starter
>
> The ASF Jenkins instance now supports Docker (BUILDS-25), so we could run all 
> the tests (for all languages that Avro supports) using the Avro Dockerfile. 
> We might also do a nightly build of the whole distribution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AVRO-1705) Set up Jenkins job to test all languages using Docker

2016-01-20 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108814#comment-15108814
 ] 

Sean Busbey commented on AVRO-1705:
---

do we know if buildbot supports docker?

> Set up Jenkins job to test all languages using Docker
> -
>
> Key: AVRO-1705
> URL: https://issues.apache.org/jira/browse/AVRO-1705
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.7.7
>Reporter: Tom White
>  Labels: starter
>
> The ASF Jenkins instance now supports Docker (BUILDS-25), so we could run all 
> the tests (for all languages that Avro supports) using the Avro Dockerfile. 
> We might also do a nightly build of the whole distribution.





[jira] [Commented] (AVRO-1781) Schema.parse is not thread safe

2016-01-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107950#comment-15107950
 ] 

Sean Busbey commented on AVRO-1781:
---

+1

> Schema.parse is not thread safe
> ---
>
> Key: AVRO-1781
> URL: https://issues.apache.org/jira/browse/AVRO-1781
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Sean Busbey
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
> Attachments: AVRO-1781-ADDENDUM.1.patch, AVRO-1781-ADDENDUM.2.patch, 
> AVRO-1781.1.patch, AVRO-1781.2.patch
>
>
> Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} 
> on any schema that is expressed as a JSON object (anything except bare 
> primitives).
> That static method relies on a static cache based on WeakIdentityHashMap 
> (WIHM).
> WIHM clearly states that it isn't threadsafe 
> [ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]
> {code}
>  * 
>  * Note that this implementation is not synchronized.
>  * 
>  */
> public class WeakIdentityHashMap implements Map {
> {code}
> All of the Schema.Parser instances use that same static Schema.parse method.
> The end result is that as-is it's only safe to have a single thread parsing 
> schemas in a given JVM.





[jira] [Commented] (AVRO-1783) Gracefully handle strings with wrong character encoding

2016-01-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107948#comment-15107948
 ] 

Sean Busbey commented on AVRO-1783:
---

hurm. I tried using this on my jruby 1.7.3 install and got a (different) error

{code}
busbey$ ruby ./sample_ipc_client.rb avro_user pat Hello_World
Errno::EPIPE: Broken pipe - Broken pipe
 write at org/jruby/RubyIO.java:1399
  write_buffer at 
/Users/busbey/.rvm/gems/jruby-1.7.3@AVRO-1783/gems/avro-1.9.0.pre1/lib/avro/ipc.rb:433
  write_framed_message at 
/Users/busbey/.rvm/gems/jruby-1.7.3@AVRO-1783/gems/avro-1.9.0.pre1/lib/avro/ipc.rb:421
transceive at 
/Users/busbey/.rvm/gems/jruby-1.7.3@AVRO-1783/gems/avro-1.9.0.pre1/lib/avro/ipc.rb:389
   request at 
/Users/busbey/.rvm/gems/jruby-1.7.3@AVRO-1783/gems/avro-1.9.0.pre1/lib/avro/ipc.rb:110
   request at 
/Users/busbey/.rvm/gems/jruby-1.7.3@AVRO-1783/gems/avro-1.9.0.pre1/lib/avro/ipc.rb:117
(root) at ./sample_ipc_client.rb:49
{code}

> Gracefully handle strings with wrong character encoding
> ---
>
> Key: AVRO-1783
> URL: https://issues.apache.org/jira/browse/AVRO-1783
> Project: Avro
>  Issue Type: Bug
>  Components: ruby
>Affects Versions: 1.7.7
>Reporter: Martin Kleppmann
> Attachments: AVRO-1783.patch, AVRO-1783.stack.text
>
>
> In the [vote thread for Avro 
> 1.8.0-rc2|http://mail-archives.apache.org/mod_mbox/avro-dev/201601.mbox/%3CCAGHyZ6K-oe35%2BOYROK6MSwrHxfPHvjmqhJAfRJL2dzexYw6YSw%40mail.gmail.com%3E],
>  [~busbey] noticed that [phunt's 
> avro-rpc-quickstart|https://github.com/phunt/avro-rpc-quickstart] fails:
> {code}
> busbey$ ruby sample_ipc_client.rb avro_user pat Hello_World
> Avro::IO::AvroTypeError: The datum
> "\x89\xA9\xD1\xFF@NUm\xEA\x9A\xFB\xDAx\xF5Zq"
> is not an example of schema
> {"type":"fixed","name":"MD5","namespace":"org.apache.avro.ipc","size":16}
>   write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:543
> write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:610
> each at org/jruby/RubyArray.java:1613
> write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:609
>   write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:561
>write at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:538
>  write_handshake_request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:136
>  request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:105
>  request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:117
>   (root) at sample_ipc_client.rb:49
> {code}
> I tried reproducing the error, and it is quite strange. avro-rpc-quickstart 
> works fine for me in Ruby (MRI) 2.2 and 2.1, and in JRuby 1.7.23. However, 
> [~busbey] was using JRuby 1.7.3 (as visible from the path names above), and 
> in this particular version of JRuby I was able to reproduce the issue.
> It seems that in some circumstances (but not always, bizarrely), JRuby 1.7.3 
> returns a UTF-8 encoded string from {{Digest::MD5.digest}}, rather than a 
> binary-encoded string. {{Schema.validate}} checks that the string is suitable 
> for writing as datum for a {{fixed}} type by calling {{#size}}. In this case, 
> although the MD5 digest of the schema is a 16-byte string, if you interpret 
> it as a UTF-8 encoded string, it consists of only 13 characters (i.e. some 
> sequences are interpreted as multibyte characters).
> Rather than trying to divine why JRuby is being weird here, I think this is 
> an opportunity to fix Avro's handling of strings to make it robust against 
> unexpected encodings.





[jira] [Updated] (AVRO-1775) Running unit tests on Ruby 2.2

2016-01-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1775:
--
Assignee: Martin Kleppmann

> Running unit tests on Ruby 2.2
> --
>
> Key: AVRO-1775
> URL: https://issues.apache.org/jira/browse/AVRO-1775
> Project: Avro
>  Issue Type: Bug
>  Components: ruby
>Reporter: Martin Kleppmann
>Assignee: Martin Kleppmann
> Fix For: 1.8.0
>
> Attachments: AVRO-1775-1.patch
>
>
> Ruby 2.2 [removed the test/unit framework from the standard 
> library|https://bugs.ruby-lang.org/issues/9711#note-12]. As the Avro Ruby 
> implementation uses it for its tests, we need to add a dependency on the 
> {{test-unit}} gem in order to run the tests in Ruby 2.2.





[jira] [Commented] (AVRO-1781) Schema.parse is not thread safe

2016-01-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107920#comment-15107920
 ] 

Sean Busbey commented on AVRO-1781:
---

{code}
import com.google.common.base.Optional;
import com.google.common.collect.MapMaker;
{code}

Both of these imports are no longer needed.

> Schema.parse is not thread safe
> ---
>
> Key: AVRO-1781
> URL: https://issues.apache.org/jira/browse/AVRO-1781
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Sean Busbey
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
> Attachments: AVRO-1781-ADDENDUM.1.patch, AVRO-1781.1.patch, 
> AVRO-1781.2.patch
>
>
> Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} 
> on any schema that is expressed as a JSON object (anything except bare 
> primitives).
> That static method relies on a static cache based on WeakIdentityHashMap 
> (WIHM).
> WIHM clearly states that it isn't threadsafe 
> [ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]
> {code}
>  * 
>  * Note that this implementation is not synchronized.
>  * 
>  */
> public class WeakIdentityHashMap implements Map {
> {code}
> All of the Schema.Parser instances use that same static Schema.parse method.
> The end result is that as-is it's only safe to have a single thread parsing 
> schemas in a given JVM.





[jira] [Commented] (AVRO-1781) Schema.parse is not thread safe

2016-01-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107915#comment-15107915
 ] 

Sean Busbey commented on AVRO-1781:
---

is the addendum meant to be applied after reverting the previous patch?

> Schema.parse is not thread safe
> ---
>
> Key: AVRO-1781
> URL: https://issues.apache.org/jira/browse/AVRO-1781
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Sean Busbey
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
> Attachments: AVRO-1781-ADDENDUM.1.patch, AVRO-1781.1.patch, 
> AVRO-1781.2.patch
>
>
> Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} 
> on any schema that is expressed as a JSON object (anything except bare 
> primitives).
> That static method relies on a static cache based on WeakIdentityHashMap 
> (WIHM).
> WIHM clearly states that it isn't threadsafe 
> [ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]
> {code}
>  * 
>  * Note that this implementation is not synchronized.
>  * 
>  */
> public class WeakIdentityHashMap implements Map {
> {code}
> All of the Schema.Parser instances use that same static Schema.parse method.
> The end result is that as-is it's only safe to have a single thread parsing 
> schemas in a given JVM.





Re: buildbot failure in ASF Buildbot on avro-c-ubuntu

2016-01-19 Thread Sean Busbey
could we move to do these buildbot builds via docker instead?

On Tue, Jan 19, 2016 at 4:01 PM, Doug Cutting  wrote:

> According to:
>
>   https://ci.apache.org/buildbot.html
>
> you should send a message to bui...@apache.org or file a Jira.
>
> Or you might ask if anyone in the infra HipChat channel can do it:
>
> https://s.apache.org/infrachat
>
> Doug
>
> On Tue, Jan 19, 2016 at 1:45 PM, Martin Kleppmann 
> wrote:
> > Looks like this CI box doesn't have libjansson installed. Does someone
> have the magic touch to install it please?
> >
> >> On 19 Jan 2016, at 20:54, build...@apache.org wrote:
> >>
> >> The Buildbot has detected a new failure on builder avro-c-ubuntu while
> building ASF Buildbot. Full details are available at:
> >>http://ci.apache.org/builders/avro-c-ubuntu/builds/17
> >>
> >> Buildbot URL: http://ci.apache.org/
> >>
> >> Buildslave for this Build: bb-vm_ubuntu
> >>
> >> Build Reason: The AnyBranchScheduler scheduler named 'AvroC' triggered
> this build
> >> Build Source Stamp: [branch avro/trunk] 1725610
> >> Blamelist: martinkl,tomwhite
> >>
> >> BUILD FAILED: failed configure
> >>
> >> Sincerely,
> >> -The Buildbot
> >>
> >>
> >>
> >
>



-- 
Sean


[jira] [Commented] (AVRO-1559) Drop support for Ruby 1.8

2016-01-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107863#comment-15107863
 ] 

Sean Busbey commented on AVRO-1559:
---

I'm +1 on dropping 1.8, provided we have a prominent release note and we're 
fine with the possibility of more 1.7 releases should requests come in.

I'd still like to avoid expressly removing 1.9 as well. I'd prefer to build up 
2.2+ support in a different gem.

> Drop support for Ruby 1.8
> -
>
> Key: AVRO-1559
> URL: https://issues.apache.org/jira/browse/AVRO-1559
> Project: Avro
>  Issue Type: Wish
>Affects Versions: 1.7.7
>Reporter: Willem van Bergen
>Assignee: Willem van Bergen
> Fix For: 1.8.0
>
> Attachments: AVRO-1559.patch
>
>
> - Ruby 1.8 is EOL, and even its security issues aren't addressed anymore. 
> - It is also getting hard to set up Ruby 1.8 to run the tests (e.g. on a 
> recent OSX, it won't compile without manual fiddling).
> - Handling character encodings in Ruby 1.9 is very different than Ruby 1.8. 
> Supporting both at the same time adds a lot of overhead.





[jira] [Commented] (AVRO-1783) Gracefully handle strings with wrong character encoding

2016-01-11 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092834#comment-15092834
 ] 

Sean Busbey commented on AVRO-1783:
---

good digging! also, I should update my jruby version. ;)

> Gracefully handle strings with wrong character encoding
> ---
>
> Key: AVRO-1783
> URL: https://issues.apache.org/jira/browse/AVRO-1783
> Project: Avro
>  Issue Type: Bug
>  Components: ruby
>Affects Versions: 1.7.7
>Reporter: Martin Kleppmann
>
> In the [vote thread for Avro 
> 1.8.0-rc2|http://mail-archives.apache.org/mod_mbox/avro-dev/201601.mbox/%3CCAGHyZ6K-oe35%2BOYROK6MSwrHxfPHvjmqhJAfRJL2dzexYw6YSw%40mail.gmail.com%3E],
>  [~busbey] noticed that [phunt's 
> avro-rpc-quickstart|https://github.com/phunt/avro-rpc-quickstart] fails:
> {code}
> busbey$ ruby sample_ipc_client.rb avro_user pat Hello_World
> Avro::IO::AvroTypeError: The datum
> "\x89\xA9\xD1\xFF@NUm\xEA\x9A\xFB\xDAx\xF5Zq"
> is not an example of schema
> {"type":"fixed","name":"MD5","namespace":"org.apache.avro.ipc","size":16}
>   write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:543
> write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:610
> each at org/jruby/RubyArray.java:1613
> write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:609
>   write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:561
>write at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:538
>  write_handshake_request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:136
>  request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:105
>  request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:117
>   (root) at sample_ipc_client.rb:49
> {code}
> I tried reproducing the error, and it is quite strange. avro-rpc-quickstart 
> works fine for me in Ruby (MRI) 2.2 and 2.1, and in JRuby 1.7.23. However, 
> [~busbey] was using JRuby 1.7.3 (as visible from the path names above), and 
> in this particular version of JRuby I was able to reproduce the issue.
> It seems that in some circumstances (but not always, bizarrely), JRuby 1.7.3 
> returns a UTF-8 encoded string from {{Digest::MD5.digest}}, rather than a 
> binary-encoded string. {{Schema.validate}} checks that the string is suitable 
> for writing as datum for a {{fixed}} type by calling {{#size}}. In this case, 
> although the MD5 digest of the schema is a 16-byte string, if you interpret 
> it as a UTF-8 encoded string, it consists of only 13 characters (i.e. some 
> sequences are interpreted as multibyte characters).
> Rather than trying to divine why JRuby is being weird here, I think this is 
> an opportunity to fix Avro's handling of strings to make it robust against 
> unexpected encodings.





[jira] [Commented] (AVRO-1781) Schema.parse is not thread safe

2016-01-11 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092283#comment-15092283
 ] 

Sean Busbey commented on AVRO-1781:
---

2 main approaches I see:

* externalize the cache and store one per Schema.Parser instance, so you get 
caching if you reuse and the app can decide on parallelism trade-offs
* switch to using a R/W lock based cache so that it is threadsafe.
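
A minimal sketch of the second option, to make the trade-off concrete. The class name RwLockedCache is hypothetical and not part of Avro, and a plain IdentityHashMap stands in for WeakIdentityHashMap. One caveat: a shared read lock is only safe when reads do not mutate the map. Weak maps that expunge stale entries on lookup (java.util.WeakHashMap does this) would need the write lock for reads as well.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Hypothetical wrapper (not an Avro class): guards a non-thread-safe map with a R/W lock. */
final class RwLockedCache<K, V> {
  // Stand-in for the identity-based cache; IdentityHashMap.get does not mutate the map.
  private final Map<K, V> delegate = new IdentityHashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  /** Lookup under the shared read lock; many readers may proceed at once. */
  V get(K key) {
    lock.readLock().lock();
    try {
      return delegate.get(key);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Returns the cached value, computing and storing it under the write lock if absent. */
  V computeIfAbsent(K key, Function<? super K, ? extends V> loader) {
    V cached = get(key);
    if (cached != null) {
      return cached;
    }
    lock.writeLock().lock();
    try {
      // Re-check: another thread may have filled the entry between the two locks.
      V existing = delegate.get(key);
      if (existing == null) {
        existing = loader.apply(key);
        delegate.put(key, existing);
      }
      return existing;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

The first option (one cache per Schema.Parser) avoids the lock entirely, at the cost of losing cache hits across parser instances.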

> Schema.parse is not thread safe
> ---
>
> Key: AVRO-1781
> URL: https://issues.apache.org/jira/browse/AVRO-1781
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Sean Busbey
>Priority: Blocker
> Fix For: 1.8.0
>
>
> Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} 
> on any schema that is expressed as a JSON object (anything except bare 
> primitives).
> That static method relies on a static cache based on WeakIdentityHashMap 
> (WIHM).
> WIHM clearly states that it isn't threadsafe 
> [ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]
> {code}
>  * 
>  * Note that this implementation is not synchronized.
>  * 
>  */
> public class WeakIdentityHashMap implements Map {
> {code}
> All of the Schema.Parser instances use that same static Schema.parse method.
> The end result is that as-is it's only safe to have a single thread parsing 
> schemas in a given JVM.





Re: [VOTE] Avro release 1.8.0 (rc2)

2016-01-11 Thread Sean Busbey
* docs LICENSE : AVRO-1779
* NPE in avro-tools : AVRO-1780

I haven't gotten to verify the avro-rpc-quickstart problem yet, but there
is a concurrency bug in Schema.parse for the java library. Filed as blocker
under AVRO-1781.

On Sat, Jan 9, 2016 at 12:51 AM, Sean Busbey  wrote:

> -1 (non-binding)
>
> bad:
> * the artifact avro-doc-1.8.0.tar.gz has no LICENSE/NOTICE files
> * the avro-tools jar fails with NPE[1]
>
> I'll file issues.
>
> mixed:
>
> * ruby successfully loads and I can run some simple IO, but phunt's
> avro-rpc-quickstart fails[2]. I'm not sure yet if the failure is expected
> given the changes in 1.8.0.
>
> good:
> * other LICENSE/NOTICE spot-check looks fine
> * signatures match
> * checksums match
> * tag matches src tarball, ignoring one nit
>
> Only in avro-src/avro-src-1.8.0/lang/java/mapred/src/test/
> resources/org/apache/avro/mapreduce/mapreduce-test-input.avro: SUCCESS.crc
>
> [1]:
>
> busbey$ java -jar avro-tools-1.8.0.jar
> Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
> at org.apache.avro.tool.Main.printStream(Main.java:105)
> at org.apache.avro.tool.Main.run(Main.java:92)
> at org.apache.avro.tool.Main.main(Main.java:74)
> busbey$ java -jar avro-tools-1.8.0.jar --help
> Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
> at org.apache.avro.tool.Main.printStream(Main.java:105)
> at org.apache.avro.tool.Main.run(Main.java:92)
> at org.apache.avro.tool.Main.main(Main.java:74)
>
>   looks like the tooling looks for NOTICE.txt at the top level and we
> don't have one there anymore :(
>
>
> https://github.com/apache/avro/blob/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/Main.java#L92
>
>
> [2]: https://github.com/phunt/avro-rpc-quickstart
>
> busbey$ ruby sample_ipc_client.rb avro_user pat Hello_World
> Avro::IO::AvroTypeError: The datum 
> "\x89\xA9\xD1\xFF@NUm\xEA\x9A\xFB\xDAx\xF5Zq"
> is not an example of schema
> {"type":"fixed","name":"MD5","namespace":"org.apache.avro.ipc","size":16}
>write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:543
>  write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:610
>  each at org/jruby/RubyArray.java:1613
>  write_record at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:609
>write_data at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:561
> write at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:538
>   write_handshake_request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:136
>   request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:105
>   request at
> /Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:117
>(root) at sample_ipc_client.rb:49
>
>
> On Wed, Jan 6, 2016 at 9:16 AM, Tom White  wrote:
>
>> I've created another release candidate for Avro 1.8.0 that fixes the
>> problems found in rc1. (Thanks to Ryan and Martin for the patches to fix
>> them.)
>>
>> The changes are listed at:
>> http://s.apache.org/avro180
>>
>> The release artifacts can be found here:
>> https://dist.apache.org/repos/dist/dev/avro/avro-1.8.0-rc2/
>>
>> The tag corresponding to this release candidate is:
>> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc2/
>>
>> You can find the KEYS file here:
>> https://dist.apache.org/repos/dist/release/avro/KEYS
>>
>> The Maven staging repository is at:
>> https://repository.apache.org/content/repositories/orgapacheavro-1004/
>>
>> Please download, verify, and test. The vote will remain open for at least
>> 72 hours. Again, this is the first release that has been built using
>> Docker, so please pay extra attention to the languages you are interested
>> in. Thanks in advance for voting!
>>
>> Cheers,
>> Tom
>>
>
>
>
> --
> Sean
>



-- 
Sean


[jira] [Updated] (AVRO-1781) Schema.parse is not thread safe

2016-01-11 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1781:
--
Summary: Schema.parse is not thread safe  (was: Schema.)

> Schema.parse is not thread safe
> ---
>
> Key: AVRO-1781
> URL: https://issues.apache.org/jira/browse/AVRO-1781
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>    Reporter: Sean Busbey
>Priority: Blocker
> Fix For: 1.8.0
>
>
> Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} 
> on any schema that is expressed as a JSON object (anything except bare 
> primitives).
> That static method relies on a static cache based on WeakIdentityHashMap 
> (WIHM).
> WIHM clearly states that it isn't threadsafe 
> [ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]
> {code}
>  * 
>  * Note that this implementation is not synchronized.
>  * 
>  */
> public class WeakIdentityHashMap implements Map {
> {code}
> All of the Schema.Parser instances use that same static Schema.parse method.
> The end result is that as-is it's only safe to have a single thread parsing 
> schemas in a given JVM.





[jira] [Created] (AVRO-1781) Schema.

2016-01-11 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1781:
-

 Summary: Schema.
 Key: AVRO-1781
 URL: https://issues.apache.org/jira/browse/AVRO-1781
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Sean Busbey
Priority: Blocker
 Fix For: 1.8.0


Post AVRO-1497, Schema.parse calls {{LogicalTypes.fromSchemaIgnoreInvalid}} on 
any schema that is expressed as a JSON object (anything except bare primitives).

That static method relies on a static cache based on WeakIdentityHashMap (WIHM).

WIHM clearly states that it isn't threadsafe 
[ref|https://github.com/apache/avro/blob/branch-1.8/lang/java/avro/src/main/java/org/apache/avro/util/WeakIdentityHashMap.java#L42]

{code}
 * 
 * Note that this implementation is not synchronized.
 * 
 */
public class WeakIdentityHashMap implements Map {
{code}

All of the Schema.Parser instances use that same static Schema.parse method.

The end result is that as-is it's only safe to have a single thread parsing 
schemas in a given JVM.





[jira] [Created] (AVRO-1779) Avro docs convenience artifact missing LICENSE/NOTICE

2016-01-11 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1779:
-

 Summary: Avro docs convenience artifact missing LICENSE/NOTICE
 Key: AVRO-1779
 URL: https://issues.apache.org/jira/browse/AVRO-1779
 Project: Avro
  Issue Type: Bug
  Components: doc
Affects Versions: 1.8.0
Reporter: Sean Busbey
Priority: Blocker
 Fix For: 1.8.0


For releases we generate a convenience artifact with our docs. At present, this 
tarball is missing the required LICENSE/NOTICE files.





[jira] [Created] (AVRO-1780) Avro tools jar fails with NPE

2016-01-11 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1780:
-

 Summary: Avro tools jar fails with NPE
 Key: AVRO-1780
 URL: https://issues.apache.org/jira/browse/AVRO-1780
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Sean Busbey
Priority: Blocker
 Fix For: 1.8.0


following our license/notice updates, the avro-tools jar fails with an NPE 
because it wants to print out a NOTICE.txt in the root of the jar.

{code}
busbey$ java -jar avro-tools-1.8.0.jar
Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
at org.apache.avro.tool.Main.printStream(Main.java:105)
at org.apache.avro.tool.Main.run(Main.java:92)
at org.apache.avro.tool.Main.main(Main.java:74)
busbey$ java -jar avro-tools-1.8.0.jar --help
Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
at org.apache.avro.tool.Main.printStream(Main.java:105)
at org.apache.avro.tool.Main.run(Main.java:92)
at org.apache.avro.tool.Main.main(Main.java:74)

{code}

We should probably not print the entire NOTICE unless a cli arg is given, since 
it is much bigger now.
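
A minimal sketch of a null-safe lookup, assuming the NPE comes from a classpath resource lookup that returns null when NOTICE.txt is absent. The class and method names here are hypothetical; this is not the actual code in org.apache.avro.tool.Main.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper (not the avro-tools Main class). */
final class NoticePrinter {
  /**
   * Copies a classpath resource to {@code out}.
   * Returns false instead of throwing when the resource does not exist.
   */
  static boolean printResource(String name, OutputStream out) {
    // getResourceAsStream returns null for a missing resource;
    // try-with-resources skips close() on a null resource.
    try (InputStream in = NoticePrinter.class.getResourceAsStream(name)) {
      if (in == null) {
        return false;
      }
      byte[] buffer = new byte[4096];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A caller could then print a short one-line pointer when this returns false, and only dump the full NOTICE when an explicit cli arg asks for it.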





Re: [VOTE] Avro release 1.8.0 (rc2)

2016-01-08 Thread Sean Busbey
-1 (non-binding)

bad:
* the artifact avro-doc-1.8.0.tar.gz has no LICENSE/NOTICE files
* the avro-tools jar fails with NPE[1]

I'll file issues.

mixed:

* ruby successfully loads and I can run some simple IO, but phunt's
avro-rpc-quickstart fails[2]. I'm not sure yet if the failure is expected
given the changes in 1.8.0.

good:
* other LICENSE/NOTICE spot-check looks fine
* signatures match
* checksums match
* tag matches src tarball, ignoring one nit

Only in avro-src/avro-src-1.8.0/lang/java/mapred/src/test/
resources/org/apache/avro/mapreduce/mapreduce-test-input.avro: SUCCESS.crc

[1]:

busbey$ java -jar avro-tools-1.8.0.jar
Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
at org.apache.avro.tool.Main.printStream(Main.java:105)
at org.apache.avro.tool.Main.run(Main.java:92)
at org.apache.avro.tool.Main.main(Main.java:74)
busbey$ java -jar avro-tools-1.8.0.jar --help
Version 1.8.0 of Exception in thread "main" java.lang.NullPointerException
at org.apache.avro.tool.Main.printStream(Main.java:105)
at org.apache.avro.tool.Main.run(Main.java:92)
at org.apache.avro.tool.Main.main(Main.java:74)

  looks like the tooling looks for NOTICE.txt at the top level and we don't
have one there anymore :(

https://github.com/apache/avro/blob/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/Main.java#L92


[2]: https://github.com/phunt/avro-rpc-quickstart

busbey$ ruby sample_ipc_client.rb avro_user pat Hello_World
Avro::IO::AvroTypeError: The datum
"\x89\xA9\xD1\xFF@NUm\xEA\x9A\xFB\xDAx\xF5Zq"
is not an example of schema
{"type":"fixed","name":"MD5","namespace":"org.apache.avro.ipc","size":16}
   write_data at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:543
 write_record at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:610
 each at org/jruby/RubyArray.java:1613
 write_record at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:609
   write_data at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:561
write at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/io.rb:538
  write_handshake_request at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:136
  request at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:105
  request at
/Users/busbey/.rvm/gems/jruby-1.7.3/gems/avro-1.8.0/lib/avro/ipc.rb:117
   (root) at sample_ipc_client.rb:49


On Wed, Jan 6, 2016 at 9:16 AM, Tom White  wrote:

> I've created another release candidate for Avro 1.8.0 that fixes the
> problems found in rc1. (Thanks to Ryan and Martin for the patches to fix
> them.)
>
> The changes are listed at:
> http://s.apache.org/avro180
>
> The release artifacts can be found here:
> https://dist.apache.org/repos/dist/dev/avro/avro-1.8.0-rc2/
>
> The tag corresponding to this release candidate is:
> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc2/
>
> You can find the KEYS file here:
> https://dist.apache.org/repos/dist/release/avro/KEYS
>
> The Maven staging repository is at:
> https://repository.apache.org/content/repositories/orgapacheavro-1004/
>
> Please download, verify, and test. The vote will remain open for at least
> 72 hours. Again, this is the first release that has been built using
> Docker, so please pay extra attention to the languages you are interested
> in. Thanks in advance for voting!
>
> Cheers,
> Tom
>



-- 
Sean


[jira] [Commented] (AVRO-1738) add java tool for outputting schema fingerprints

2016-01-04 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081556#comment-15081556
 ] 

Sean Busbey commented on AVRO-1738:
---

yeah, that'd be great.

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





Re: [DISCUSS][JAVA] Generating toBytes/fromBytes methods?

2015-12-22 Thread Sean Busbey
Including a schema fingerprint at the start

1) reuses stuff we have
2) gives a language independent notion of compatibility
3) doesn't bind how folks get stuff in/out of the single record form.
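
A rough sketch of what that framing could look like, using only the JDK. Everything here is an assumption for illustration: the class name is hypothetical, and MD5 over the raw schema JSON text is a stand-in for Avro's real fingerprints, which are computed over the schema's Parsing Canonical Form.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical framing helper: [16-byte fingerprint][encoded record bytes]. */
final class FingerprintFraming {
  static final int FINGERPRINT_LENGTH = 16; // MD5 digest size in bytes

  /** Stand-in fingerprint: MD5 of the schema text as given (no canonicalization). */
  static byte[] fingerprint(String schemaJson) {
    try {
      return MessageDigest.getInstance("MD5")
          .digest(schemaJson.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 is required on every Java platform", e);
    }
  }

  /** Prepends the writer schema's fingerprint to an already-encoded record. */
  static byte[] frame(String schemaJson, byte[] encodedRecord) {
    return ByteBuffer.allocate(FINGERPRINT_LENGTH + encodedRecord.length)
        .put(fingerprint(schemaJson))
        .put(encodedRecord)
        .array();
  }

  /** True when the framed bytes start with the fingerprint of the given schema. */
  static boolean matches(String schemaJson, byte[] framed) {
    return framed.length >= FINGERPRINT_LENGTH
        && Arrays.equals(fingerprint(schemaJson),
            Arrays.copyOfRange(framed, 0, FINGERPRINT_LENGTH));
  }

  /** Strips the fingerprint prefix and returns the encoded record. */
  static byte[] payload(byte[] framed) {
    return Arrays.copyOfRange(framed, FINGERPRINT_LENGTH, framed.length);
  }
}
```

A reader would check the prefix against the fingerprints of the schemas it knows, pick the matching writer schema, and decode the remainder however it likes.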

-- 
Sean Busbey
On Dec 22, 2015 06:52, "Niels Basjes"  wrote:

> I was not clear enough in my previous email.
> What I meant is to 'wrap' the application schema in a serialization wrapper
> schema that has a field indicating the "schema classname".
> That (generic setup) combined with some generated code in the schema
> classes should yield a solution that supports schema migration.
>
> Niels
>
> On Tue, Dec 22, 2015 at 11:55 AM, Niels Basjes  wrote:
>
> > Thanks for pointing this out.
> > This is exactly what I was working on.
> >
> > The way I solved the 'does the schema match' question at work is by
> > requiring that all schemas start with a single text field "schema
> > classname" being the full class name of the class that was used to
> > generate it.
> > That way we can have newer versions of the schema and still be able to
> > unpack them. In this form the classname is essentially an indicator if
> > schema migration is possible; even though the schemas are different.
> >
> > What do you think of this direction?
> >
> > Niels
> >
> >
> > On Mon, Dec 21, 2015 at 11:30 PM, Ryan Blue  wrote:
> >
> >> Niels,
> >>
> >> Having methods like this sounds like a good idea to me. I've had to
> >> write those methods several times!
> >>
> >> The idea is also related to AVRO-1704 [1], which is a suggestion to
> >> standardize the encoding that is used for single records. Some projects
> >> have been embedding the schema fingerprint at the start of each record,
> >> for example, which would be a helpful thing to do.
> >>
> >> It may also be a good idea to create a helper object rather than
> >> attaching new methods to the datum classes themselves. In your example
> >> below, you have to create a new encoder or decoder for each method call.
> >> We could instead keep a backing buffer and encoder/decoder on a class
> >> that the caller instantiates so that they can be reused. At the same
> >> time, that would make it possible to reuse the class with any data model
> >> and manage the available schemas (if embedding the fingerprint).
> >>
> >> I'm thinking something like this:
> >>
> >>   ReflectClass datum = new ReflectClass();
> >>   ReflectData model = ReflectData.get();
> >>   DatumCodec codec = new DatumCodec(model, schema);
> >>
> >>   // convert datum to bytes using data model
> >>   byte[] asBytes = codec.toBytes(datum);
> >>
> >>   // convert bytes to datum using data model
> >>   ReflectClass copy = codec.fromBytes(asBytes);
> >>
> >> What do you think?
> >>
> >> rb
> >>
> >>
> >> [1]: https://issues.apache.org/jira/browse/AVRO-1704
> >>
> >>
> >> On 12/18/2015 05:01 AM, Niels Basjes wrote:
> >>
> >>> Hi,
> >>>
> >>> I'm working on a project where I'm putting Avro records into Kafka and
> >>> at the other end pulling them out again.
> >>> For that purpose I wrote two methods 'toBytes' and 'fromBytes' in a
> >>> separate class (see below).
> >>>
> >>> I see this as the type of problem many developers run into.
> >>> Would it be a good idea to generate methods like these into the
> >>> generated Java code?
> >>>
> >>> This would make it possible to serialize and deserialize single
> >>> records like this:
> >>>
> >>> byte [] someBytes = measurement.toBytes();
> >>> Measurement m = Measurement.fromBytes(someBytes);
> >>>
> >>> Niels Basjes
> >>>
> >>> P.S. possibly not name it toBytes but getBytes (similar to what the
> >>> String class has)
> >>>
> >>> public final class MeasurementSerializer {
> >>>  private MeasurementSerializer() {
> >>>  }
> >>>
> >>>  public static Measurement fromBytes(byte[] bytes) throws
> >>> IOException {
> >>>  try {
> >>>  DatumReader reader = new
> >>> SpecificDatumReader<>(Measurement.getClassSchema());
> >>>   

Re: [DISCUSS] Migrate to Java 7

2015-12-20 Thread Sean Busbey
Cross compiling from jdk8 to jre7 is easy to do incorrectly, such that you
end up with bytecode with the right class version number but e.g. incorrect
assumptions about SDK APIs.

If we're aiming to support jre7, I'd rather we compile with jdk7 until the
simplified cross compiling support lands in jdk9. (Presuming it still does.)
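
To make the "right class version number" point concrete, here is a small sketch that reads the major version from a class-file header (51 is Java 7, 52 is Java 8). The class and method names are hypothetical. The check passing says nothing about API usage: for example, code compiled against the JDK 8 class library that calls keySet() on a variable typed as ConcurrentHashMap links against a return type that does not exist on a Java 7 runtime, even when the class file reports version 51.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: inspects only the class-file header, nothing else.
final class ClassVersionSketch {

    /** Returns the class-file major version (50 = Java 6, 51 = Java 7, 52 = Java 8). */
    static int majorVersion(byte[] classFile) {
        ByteBuffer buf = ByteBuffer.wrap(classFile); // big-endian by default, like class files
        if (classFile.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // skip minor_version
        return buf.getShort() & 0xffff; // major_version
    }

    public static void main(String[] args) {
        // Hand-built header standing in for a real .class file.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        System.out.println("major version: " + majorVersion(header));
    }
}
```

Compiling with an actual JDK 7, or pointing the newer compiler at the Java 7 class library, is what catches the API mismatch; a header check like this one only confirms the bytecode level.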

-- 
Sean Busbey
On Dec 19, 2015 4:54 AM, "Niels Basjes"  wrote:

> Ah, yes I understand what you mean.
> I mean to say something slightly different.
> 1) Keep/Make the source Java 1.7 compliant.
> 2) Compile using the newest compiler (i.e. 1.8)
> 3) Compile towards the binary 1.7 compliant.
>
> That way we're taking advantage of the best compiler yet with enough
> backward compatibility.
> This would also avoid the 1.8 bytecode problems.
>
> Niels Basjes
>
>
>
> On Fri, Dec 18, 2015 at 5:59 PM, Ryan Blue  wrote:
>
> > I would agree, but unfortunately Java 8 features required bytecode
> > changes and Java 8 code can't be compiled to target Java 7. There is a
> > good summary of it a few answers down on this SO question:
> >
> >
> >
> >
> https://stackoverflow.com/questions/16143684/can-java-8-code-be-compiled-to-run-on-java-7-jvm
> >
> > I think that means we should stay on Java 7 as long as possible. I
> > propose taking a look at this in 6 months or so to see how many people
> > are still running Java 7 or have moved to 8.
> >
> > rb
> >
> >
> > On 12/18/2015 05:03 AM, Niels Basjes wrote:
> >
> >> Ryan,
> >>
> >> Perhaps we should even make the step to build using Java 8, yet generate
> >> bytecode at the Java 7 level.
> >>
> >> Niels
> >>
> >> On Fri, Dec 18, 2015 at 1:25 PM, Niels Basjes  wrote:
> >>
> >> +1
> >>> On 18 Dec 2015 12:33, "Tom White"  wrote:
> >>>
> >>> +1
> >>>>
> >>>> Tom
> >>>>
> >>>> On Tue, Dec 15, 2015 at 1:08 AM, Ryan Blue  wrote:
> >>>>
> >>>>> I just noticed that our tests are still compiling and running with
> >>>>> Java 6. Java 7 is already end-of-life (public patches at least), so I
> >>>>> think it is reasonable to start migrating. Is everyone okay with
> >>>>> updating the builds and dropping support for Java 6?
> >>>>>
> >>>>> rb
> >>>>>
> >>>>> --
> >>>>> Ryan Blue
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Cloudera, Inc.
> >
>
>
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>


Re: [VOTE] Avro release 1.8.0 (rc1)

2015-12-19 Thread Sean Busbey
-1 (non-binding)


* the artifact avro-doc-1.8.0.tar.gz has no LICENSE/NOTICE files
* I can confirm the ruby error Martin ran into.
* nit, the source tag does not match the source artifact
avro-1.8.0-rc1 busbey$ diff -r svn_1.8.0_rc1_tag avro-src/avro-src-1.8.0/ |
grep -v "\.svn"
Only in
avro-src/avro-src-1.8.0/lang/java/mapred/src/test/resources/org/apache/avro/mapreduce/mapreduce-test-input.avro:
SUCCESS.crc
diff -r svn_1.8.0_rc1_tag/lang/py3/README.txt
avro-src/avro-src-1.8.0/lang/py3/README.txt
10,13d9
<
< For LICENSE and NOTICE information for the python3 implementation, see:
< * avro/LICENSE
< * avro/NOTICE


* signatures match
* checksums match



On Wed, Dec 16, 2015 at 11:55 AM, Tom White  wrote:

> I have created a new candidate build for Avro release 1.8.0 following the
> rc0 vote in August that didn't pass due to licensing/notice issues. (Thanks
> to Ryan and Sean for fixing them!)
>
> The changes are listed at:
> http://s.apache.org/avro180
>
> The release artifacts can be found here:
> https://dist.apache.org/repos/dist/dev/avro/avro-1.8.0-rc1/
>
> The tag corresponding to this release candidate is
> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc1/
>
> You can find the KEYS file here:
> https://dist.apache.org/repos/dist/release/avro/KEYS
>
> The Maven staging repository is at:
> https://repository.apache.org/content/repositories/orgapacheavro-1003
>
> Please download, verify, and test. This is the first release that has been
> built using Docker, so please pay extra attention to the languages you are
> interested in. Thanks in advance for voting!
>
> Cheers,
> Tom
>



-- 
Sean


Re: [DISCUSS] Migrate to Java 7

2015-12-18 Thread Sean Busbey
this would be after 1.8, right?

sounds reasonable, but we should send a [DISCUSS] to user@avro after the
new year to gauge the impact on downstream before we decide.

On Mon, Dec 14, 2015 at 7:08 PM, Ryan Blue  wrote:

> I just noticed that our tests are still compiling and running with Java 6.
> Java 7 is already end-of-life (public patches at least), so I think it is
> reasonable to start migrating. Is everyone okay with updating the builds
> and dropping support for Java 6?
>
> rb
>
> --
> Ryan Blue
>



-- 
Sean


Re: [VOTE] Avro release 1.8.0 (rc1)

2015-12-17 Thread Sean Busbey
what's the voting period on this RC?

On Wed, Dec 16, 2015 at 11:55 AM, Tom White  wrote:

> I have created a new candidate build for Avro release 1.8.0 following the
> rc0 vote in August that didn't pass due to licensing/notice issues. (Thanks
> to Ryan and Sean for fixing them!)
>
> The changes are listed at:
> http://s.apache.org/avro180
>
> The release artifacts can be found here:
> https://dist.apache.org/repos/dist/dev/avro/avro-1.8.0-rc1/
>
> The tag corresponding to this release candidate is
> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc1/
>
> You can find the KEYS file here:
> https://dist.apache.org/repos/dist/release/avro/KEYS
>
> The Maven staging repository is at:
> https://repository.apache.org/content/repositories/orgapacheavro-1003
>
> Please download, verify, and test. This is the first release that has been
> built using Docker, so please pay extra attention to the languages you are
> interested in. Thanks in advance for voting!
>
> Cheers,
> Tom
>



-- 
Sean


[jira] [Updated] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-12-15 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1728:
--
Attachment: AVRO-1728.4.patch

-04 Updated Ryan's patch to correct a few last things

 - Always correct the tools javadoc LICENSE/NOTICE, rather than just in the 
apache-release profile
 - use LICENSE/NOTICE for tools shaded binary
 - excluded an additional copy of the ASLv2 brought in by a shaded dependency 
in the tools shaded binary
 - use LICENSE.txt/NOTICE.txt for tools source jar and non-shaded binary
 - fix tools test-jar to use plain LICENSE/NOTICE
 - stripped trailing whitespace

We publish the 'nodeps' classifier with a binary jar, which made the "included 
in the binary" language ambiguous. I fixed the test-jar while I was there, but 
it looks like we don't publish it for the tools jar.

ASF policy allows for having either .txt or bare files and breaking things out 
like this makes sure we are in line with the guidance to only include 
LICENSE/NOTICE information about the contents of the actual artifact.

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch, AVRO-1728.2.patch, AVRO-1728.3.patch, 
> AVRO-1728.4.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-12-15 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058154#comment-15058154
 ] 

Sean Busbey commented on AVRO-1728:
---

I noticed on tools your javadoc LICENSE/NOTICE changes only happen on the 
apache-release profile. At first I was testing with the "dist" profile defined 
in the top-level. Could we attach those changes whenever a javadoc artifact is 
made?

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch, AVRO-1728.2.patch, AVRO-1728.3.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-12-15 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058075#comment-15058075
 ] 

Sean Busbey commented on AVRO-1728:
---

I'm reviewing now. I think we can tease apart the source/bin differences, let 
me give it a shot.

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch, AVRO-1728.2.patch, AVRO-1728.3.patch
>
>






[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-12-03 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038296#comment-15038296
 ] 

Sean Busbey commented on AVRO-1722:
---

+1 for addendum. nit: I'd like it better if the start of the commit message 
included "AVRO-1722 ADDENDUM"

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
> Attachments: AVRO-1722.2.patch
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2015-11-23 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022895#comment-15022895
 ] 

Sean Busbey commented on AVRO-1641:
---

FYI, I started prepping to put this in, but my first pass at a simple benchmark 
showed things getting worse. So I'll need to make something more formal to show 
the impact of the change.

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Assignee: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...
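
The difference between the two formulas can be seen by stepping through a few expansions. The sketch below is only an illustration: the class name and the starting length of 5 are hypothetical, and it models the array length alone, not the parser.

```java
import java.util.Arrays;

// Hypothetical model of the two growth rules quoted above.
final class StackGrowthSketch {

    /** Current rule: grow by at least 1024, then double. */
    static int growMax(int length) {
        return length + Math.max(length, 1024);
    }

    /** Proposed rule: double while small, then grow by 1024 per step. */
    static int growMin(int length) {
        return length + Math.min(length, 1024);
    }

    /** Lengths after each of the given number of expansions, starting from start. */
    static int[] sizes(int start, int steps, boolean useMin) {
        int[] out = new int[steps + 1];
        out[0] = start;
        for (int i = 1; i <= steps; i++) {
            out[i] = useMin ? growMin(out[i - 1]) : growMax(out[i - 1]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("max: " + Arrays.toString(sizes(5, 4, false)));
        System.out.println("min: " + Arrays.toString(sizes(5, 10, true)));
    }
}
```

With Math.max the first expansion already adds 1024 slots and every later one doubles the array; with Math.min the array doubles until it passes 1024 and then grows linearly. Which is faster in practice is a separate question that a benchmark would have to answer.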





[jira] [Commented] (AVRO-1738) add java tool for outputting schema fingerprints

2015-11-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015097#comment-15015097
 ] 

Sean Busbey commented on AVRO-1738:
---

{quote}
So it looks like you don't have to put the Apache content in the LICENSE file, 
though it doesn't seem forbidden (like adding things to NOTICE that aren't 
legally required). Would it be better to keep a pointer for the inclusion in 
LICENSE or to leave it out?
{quote}

I'm happy to do whatever, so long as we're consistent on the project. 
Personally, I like to make sure we give credit where we're using someone else's 
work directly.

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>    Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





[jira] [Updated] (AVRO-1738) add java tool for outputting schema fingerprints

2015-11-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1738:
--
Status: Open  (was: Patch Available)

moving out of patch available until I can update this.

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>    Reporter: Sean Busbey
>    Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





[jira] [Updated] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2015-11-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1641:
--
Assignee: Zoltan Farkas

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Assignee: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





[jira] [Updated] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2015-11-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1641:
--
Status: Patch Available  (was: Open)

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





[jira] [Commented] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2015-11-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014233#comment-15014233
 ] 

Sean Busbey commented on AVRO-1641:
---

I suspect the limits of our current developer bandwidth are what is preventing 
such folks from picking up this kind of low-hanging fruit and posting 
patches. It would definitely be nice if we could find the time to grab these 
kinds of issues.

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





[jira] [Updated] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2015-11-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1641:
--
Summary: parser.java stack should expand quickly up to some threshold 
rather than start at the threshold  (was: parser.java stack expansion probably 
not as intended)

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





[jira] [Commented] (AVRO-1641) parser.java stack expansion probably not as intended

2015-11-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013990#comment-15013990
 ] 

Sean Busbey commented on AVRO-1641:
---

{quote}
I will follow up with a patch... 
However it would be nice if we can be a bit more pragmatic in the future and 
reduce the contribution overhead for simple/obvious things...
{quote}

How can we lower the overhead from providing a patch or PR?

> parser.java stack expansion probably not as intended
> 
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





[jira] [Commented] (AVRO-1641) parser.java stack expansion probably not as intended

2015-11-19 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013568#comment-15013568
 ] 

Sean Busbey commented on AVRO-1641:
---

This sounds reasonable. If you can post a patch I'm happy to review. A 
benchmark of some kind showing the impact would help.

> parser.java stack expansion probably not as intended
> 
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Avro
>  Issue Type: Bug
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Priority: Minor
>
> at Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
>   }
> should probably be:
>   private void expandStack() {
>     stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
>   }
> This expansion probably is intended to grow exponentially up to 1024, and not 
> exponentially after 1024...





Re: One liner improvement with significant benefits

2015-11-19 Thread Sean Busbey
I commented on the ticket. thanks for pointing it out!

On Thu, Nov 19, 2015 at 7:24 AM, Zoltan Farkas  wrote:

> Can somebody integrate this:
>
> https://issues.apache.org/jira/browse/AVRO-1641
>
> it is a small change with significant impact...
>
>
> —Z




-- 
Sean


[jira] [Updated] (AVRO-1691) Error parsing schema if schema is json string

2015-11-18 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1691:
--
Component/s: c

> Error parsing schema if schema is json string
> -
>
> Key: AVRO-1691
> URL: https://issues.apache.org/jira/browse/AVRO-1691
> Project: Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.7.7
>Reporter: Vikas Kumar
>
> I have a schema which looks like:
> "int"
> in file_read_header , I got an error: Cannot parse file header: Error parsing 
> JSON: '[' or '{' expected near '"int"'
> I see avro-c uses json_loadb from jansson which may be populating this error.





[jira] [Commented] (AVRO-1691) Error parsing schema if schema is json string

2015-11-18 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011230#comment-15011230
 ] 

Sean Busbey commented on AVRO-1691:
---

Care to have a go at updating?

> Error parsing schema if schema is json string
> -
>
> Key: AVRO-1691
> URL: https://issues.apache.org/jira/browse/AVRO-1691
> Project: Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.7.7
>Reporter: Vikas Kumar
>
> I have a schema which looks like:
> "int"
> in file_read_header , I got an error: Cannot parse file header: Error parsing 
> JSON: '[' or '{' expected near '"int"'
> I see avro-c uses json_loadb from jansson which may be populating this error.





Re: [DISCUSS] Release and code management

2015-11-05 Thread Sean Busbey
we are currently blocked on all releases because of licensing errors
in under-maintained libraries.

https://issues.apache.org/jira/browse/AVRO-1722

essentially Ryan and I slowly work our way through understanding each
code base enough to do an evaluation and update things.

It's been over 2 months now and it's a crappy situation to put our
contributors in.


On Thu, Nov 5, 2015 at 11:59 AM, Philip Zeyliger  wrote:
> I think it's always ok to re-release artifacts where nothing's changed.
> So, how can you be blocked on another language's implementation if you
> simply change the version number and re-release?
>
> -- Philip
>
> On Thu, Nov 5, 2015 at 9:43 AM Ryan Blue  wrote:
>
>> Phil or Sam, any ideas about how to keep release management simple, but
>> be able to avoid blocking specific languages on under-maintained ones?
>>
>> Also, looking at the release history we've had 3 releases in the last 2
>> years, and that's being generous to include 1.7.5 that was released in
>> August 2013. I don't think more release overhead would be that big of a
>> problem, and would be well worth keeping the languages that are well
>> maintained released and up-to-date.
>>
>> rb
>>
>> On 10/30/2015 09:37 AM, Ryan Blue wrote:
>> > I think Sean is right that we could continue to release several at once.
>> > We would almost certainly continue this practice for several languages
>> > that are mostly unmaintained (like perl and php). I also expect each
>> > language's release cadence to reflect the activity in that language,
>> > which I think is very important to maintain.
>> >
>> > I also don't want to underestimate the drawback of having a single
>> > version for multiple implementations. We can't use semantic versioning
>> > for any of the implementations. If we bump the minor version (!) because
>> > of a breaking change in Java, but aren't making breaking changes to C,
>> > this is confusing to users.
>> >
>> > If we don't separate release vehicles, how can we improve version
>> > conventions?
>> >
>> > And how do we ensure timely releases that aren't blocked by other
>> > implementations? This affects how attractive this project is to new
>> > contributors. If the releases are seldom and contributions aren't
>> > available for months at a time, I think we have a problem.
>> >
>> > rb
>> >
>> > On 10/29/2015 04:51 PM, Philip Zeyliger wrote:
>> >> -0.
>> >>
>> >> If you divide the world into N releases, you'll end up having to do
>> >> release management N times.  I think this will make doing releases that
>> >> much more complicated, time-consuming, and error-prone.
>> >>
>> >> Note that you could separate release trains while remaining in a single
>> >> repo.  I'd certainly prefer that to separating into many smaller repos.
>> >>
>> >> -- Philip
>> >>
>> >> On Thu, Oct 29, 2015 at 11:31 AM Ryan Blue  wrote:
>> >>
>> >>> On 10/29/2015 11:28 AM, Sean Busbey wrote:
>> >>>> On Oct 29, 2015 1:19 PM, "Ryan Blue"  wrote:
>> >>>>
>> >>>> Where would the language interop tests live if we don't break them
>> >>>> out?
>> >>>>
>> >>>> (We already have interop tests, in case that was lost in my original
>> >>>> email.)
>> >>>
>> >>> We could either keep them where they are or add a separate repo.
>> >>> Running them with a release candidate would have to be part of the
>> >>> release checks.
>> >>>
>> >>> rb
>> >>>
>> >>>
>> >>> --
>> >>> Ryan Blue
>> >>> Software Engineer
>> >>> Cloudera, Inc.
>> >>>
>> >>
>> >
>> >
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Cloudera, Inc.
>>



-- 
Sean


Re: [DISCUSS] Release and code management

2015-10-29 Thread Sean Busbey
Why can't we do a single release vote that covers multiple components?

-- 
Sean
On Oct 29, 2015 6:51 PM, "Philip Zeyliger"  wrote:

> -0.
>
> If you divide the world into N releases, you'll end up having to do release
> management N times.  I think this will make doing releases that much more
> complicated, time-consuming, and error-prone.
>
> Note that you could separate release trains while remaining in a single
> repo.  I'd certainly prefer that than separating into many smaller repos.
>
> -- Philip
>
> On Thu, Oct 29, 2015 at 11:31 AM Ryan Blue  wrote:
>
> > On 10/29/2015 11:28 AM, Sean Busbey wrote:
> > > On Oct 29, 2015 1:19 PM, "Ryan Blue"  wrote:
> > >
> > > Where would the language interop tests live if we don't break them out?
> > >
> > > (We already have interop tests, in case that was lost in my original
> > email.)
> >
> > We could either keep them where they are or add a separate repo. Running
> > them with a release candidate would have to be part of the release
> checks.
> >
> > rb
> >
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Cloudera, Inc.
> >
>


Re: [DISCUSS] Release and code management

2015-10-29 Thread Sean Busbey
On Oct 29, 2015 1:19 PM, "Ryan Blue"  wrote:
>
> Adding [DISCUSS] to the thread to make it more obvious.
>
> Yes, we could add a cross-language test suite in a separate repo. I think
this is out of scope for this discussion, but we definitely need to have
compat tests.
>

Where would the language interop tests live if we don't break them out?

(We already have interop tests, in case that was lost in my original email.)

-- 
Sean


Re: Release and code management

2015-10-29 Thread Sean Busbey
as an aside, it might be worth re-posting this with a subject that
starts with "[DISCUSS]" to flag the attention of lurkers.

On Thu, Oct 29, 2015 at 1:06 PM, Sean Busbey  wrote:
> Presumably this would allow us to make the cross-language tests their
> own module that the language ones could then use directly for testing?
>
> How do we track which format versions are supported by given language 
> versions?
>
> On Thu, Oct 29, 2015 at 12:18 PM, Ryan Blue  wrote:
>> Hi everyone,
>>
>> Right now we keep all of the language implementations in SVN together and
>> release everything in a single source release, which I think is getting a
>> little awkward for releases. I'd like to discuss the idea of separating some
>> of the languages out on their own and moving to Apache git servers instead
>> of SVN.
>>
>> The motivation for separating languages out is to allow quicker releases
>> that aren't blocked on problems in other languages. For example, we recently
>> found license documentation issues through most of the codebase. That's
>> currently blocking the global 1.8.0 release until we have time to figure out
>> how to fix the LICENSE and NOTICE included in each convenience binary
>> artifact. That, in turn, is blocking downstream projects like parquet-avro
>> that would like to depend on features in 1.8.0.
>>
>> We're also seeing an influx of new implementations: Microsoft has pinged the
>> issue to donate their C# implementation, Miki Tebeka is interested in
>> merging fastavro, and Matthieu Monsch has kindly offered a fast node-js
>> implementation as well. These are great for expanding the community and I
>> want to make sure these new projects aren't blocked when they are used to a
>> faster release cycle.
>>
>> I propose we allow implementations to use separate repositories, like
>> avro-python or avro-java, and to make separate releases. This would allow
>> some languages to have more agile release cycles and would allow us to
>> version APIs more effectively, using semver for each language and fixing
>> format compatibility at version 1.
>>
>> Thoughts and discussion?
>>
>> rb
>>
>>
>> --
>> Ryan Blue
>
>
>
> --
> Sean



-- 
Sean


Re: Release and code management

2015-10-29 Thread Sean Busbey
Presumably this would allow us to make the cross-language tests their
own module that the language ones could then use directly for testing?

How do we track which format versions are supported by given language versions?

On Thu, Oct 29, 2015 at 12:18 PM, Ryan Blue  wrote:
> Hi everyone,
>
> Right now we keep all of the language implementations in SVN together and
> release everything in a single source release, which I think is getting a
> little awkward for releases. I'd like to discuss the idea of separating some
> of the languages out on their own and moving to Apache git servers instead
> of SVN.
>
> The motivation for separating languages out is to allow quicker releases
> that aren't blocked on problems in other languages. For example, we recently
> found license documentation issues through most of the codebase. That's
> currently blocking the global 1.8.0 release until we have time to figure out
> how to fix the LICENSE and NOTICE included in each convenience binary
> artifact. That, in turn, is blocking downstream projects like parquet-avro
> that would like to depend on features in 1.8.0.
>
> We're also seeing an influx of new implementations: Microsoft has pinged the
> issue to donate their C# implementation, Miki Tebeka is interested in
> merging fastavro, and Matthieu Monsch has kindly offered a fast node-js
> implementation as well. These are great for expanding the community and I
> want to make sure these new projects aren't blocked when they are used to a
> faster release cycle.
>
> I propose we allow implementations to use separate repositories, like
> avro-python or avro-java, and to make separate releases. This would allow
> some languages to have more agile release cycles and would allow us to
> version APIs more effectively, using semver for each language and fixing
> format compatibility at version 1.
>
> Thoughts and discussion?
>
> rb
>
>
> --
> Ryan Blue



-- 
Sean


Re: Python Avro implementations

2015-10-29 Thread Sean Busbey
sounds great to me.

On Wed, Oct 28, 2015 at 1:14 PM, Ryan Blue  wrote:
> Hi everyone,
>
> Right now, we have two python implementations: py and py3. And there is also
> fastavro [1], which is popular because it is fast and more pythonic. It also
> works with python 2.7, python 3.x, pypy, and can be sped up by cython.
>
> I had a recent e-mail exchange with Miki Tebeka, the creator and maintainer
> of fastavro, about the current python Avro implementations and he's
> interested in working with the Apache community to merge the existing
> implementations into one. I'm really excited about it, since this is a great
> opportunity to grow the Avro community and consolidate the python
> implementations.
>
> I'd like to start a discussion from this thread about next steps. I think
> the best way forward is to bring fastavro in, and then work on building
> compatibility with the current APIs where we need to so that we can
> deprecate the existing py and py3 projects.
>
> Does that sound reasonable?
>
> rb
>
>
> [1]: https://github.com/tebeka/fastavro
>
> --
> Ryan Blue
> Software Engineer
> Cloudera, Inc.



-- 
Sean


Re: Interested in contributing to Avro C

2015-10-23 Thread Sean Busbey
Here are the existing docs on the C API:

http://avro.apache.org/docs/current/api/c/index.html

On Fri, Oct 23, 2015 at 11:08 AM, Nges Brian  wrote:
> Hi
> I will like to start by doing the documentation of the new C API
> Version.  As a project but first can someone help me links to
> Documents on the legacy Version?
> Thanks and Cheers.
>
> On 10/19/15, Nges Brian  wrote:
>> Once more I can find Documents on the Legacy Version as you rightly
>> said and I have found no Document  on The New Version. Can anyone help
>> me links to get material on the new version?
>> cheers
>>
>> On 10/15/15, Sean Busbey  wrote:
>>> Also worth noting, it's extremely valuable for the community if folks
>>> make new docs when they run into gaps.
>>>
>>> In this case, Nges, you could document the new API while you were out
>>> using
>>> it.
>>>
>>> On Thu, Oct 15, 2015 at 2:01 PM, Sam Groth 
>>> wrote:
>>>> Hi Nges,
>>>> I learned the C API by reading the code here:
>>>> https://github.com/apache/avro/tree/trunk/lang/c
>>>> I would also recommend setting up some application that writes and reads
>>>> avro files. Setting up simple applications using new libraries is one of
>>>> the best ways to learn about them.
>>>> One suggestion for getting started with the C API is to use the new
>>>> version of creating avro structs for serialization rather than the
>>>> legacy
>>>> version. Unfortunately most of the documentation that can be found is
>>>> for
>>>> the legacy version.
>>>>
>>>> Sam
>>>>
>>>>
>>>>  On Thursday, October 15, 2015 1:47 PM, Nges Brian
>>>>  wrote:
>>>>
>>>>
>>>>  Hi Avro Community.
>>>> I am new here and I am interested to contribute to open source project
>>>> and precisely Avro C.
>>>> I am in My second year in the university of Buea offering Computer A
>>>> Engineering in the Faculty of Engineering and Technology.
>>>> I have been coding in C ,c++. And I am good at C. I am interested in
>>>> The Avro C project so I am asking for help on some links so that I can
>>>> start gaining experience in the code base and improve on the C API.
>>>> Thanks and waiting to hear from the community soon.
>>>>
>>>> On 10/14/15, Nges Brian  wrote:
>>>>> Hi Avro community ,
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Sean
>>>
>>



-- 
Sean


[jira] [Commented] (AVRO-1747) JavaScript implementation

2015-10-20 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965260#comment-14965260
 ] 

Sean Busbey commented on AVRO-1747:
---

You'll need to get things into a pull request (or patch) within the main Avro 
source tree. It looks like we have a javascript implementation, so we'll need 
to work out impact on existing users of that library.

You'll also have to be willing to donate the code to the Apache Software 
Foundation, which will then distribute it under the ASLv2. It looks like enough 
work has already happened that we'll have to go through the IP-clearance 
process. It's not a big deal, but will require a PMC member to drive the 
process.

> JavaScript implementation
> -
>
> Key: AVRO-1747
> URL: https://issues.apache.org/jira/browse/AVRO-1747
> Project: Avro
>  Issue Type: Improvement
>  Components: javascript
>Reporter: Matthieu Monsch
>
> Hello,
> I'm not sure if there is still interest in a JavaScript implementation of the 
> Avro spec, or if this is the right place for this message (apologies if not), 
> but in case it's useful here is one: https://github.com/mtth/avsc
> It's pretty fast, fully featured aside from protocols (AFAIK), and runs in 
> the browser.
> Disclaimer: I wrote this library. (I initially searched around for existing 
> implementations, and even saw a few tickets on this board about JavaScript 
> decoders, but couldn't find one to support the schemas I have to process.)
> Best,
> -Matthieu





Re: Interested in contributing to Avro C

2015-10-15 Thread Sean Busbey
Also worth noting, it's extremely valuable for the community if folks
make new docs when they run into gaps.

In this case, Nges, you could document the new API while you were out using it.

On Thu, Oct 15, 2015 at 2:01 PM, Sam Groth  wrote:
> Hi Nges,
> I learned the C API by reading the code here: 
> https://github.com/apache/avro/tree/trunk/lang/c
> I would also recommend setting up some application that writes and reads avro 
> files. Setting up simple applications using new libraries is one of the best 
> ways to learn about them.
> One suggestion for getting started with the C API is to use the new version 
> of creating avro structs for serialization rather than the legacy version. 
> Unfortunately most of the documentation that can be found is for the legacy 
> version.
>
> Sam
>
>
>  On Thursday, October 15, 2015 1:47 PM, Nges Brian  
> wrote:
>
>
>  Hi Avro Community.
> I am new here and I am interested to contribute to open source project
> and precisely Avro C.
> I am in My second year in the university of Buea offering Computer A
> Engineering in the Faculty of Engineering and Technology.
> I have been coding in C ,c++. And I am good at C. I am interested in
> The Avro C project so I am asking for help on some links so that I can
> start gaining experience in the code base and improve on the C API.
> Thanks and waiting to hear from the community soon.
>
> On 10/14/15, Nges Brian  wrote:
>> Hi Avro community ,
>>
>
>
>



-- 
Sean


[jira] [Updated] (AVRO-1748) Add Snappy Compression to C++ DataFile

2015-10-14 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1748:
--
Fix Version/s: 1.8.0

> Add Snappy Compression to C++ DataFile
> --
>
> Key: AVRO-1748
> URL: https://issues.apache.org/jira/browse/AVRO-1748
> Project: Avro
>  Issue Type: New Feature
>  Components: c++
>Affects Versions: 1.7.7
>Reporter: J. Langley
> Fix For: 1.7.8, 1.8.0
>
>   Original Estimate: 40h
>  Remaining Estimate: 40h
>
> The C++ component of the Avro project should support Snappy compression as 
> the Java, C, and Python components do.





Re: How to provide a patch?

2015-10-13 Thread Sean Busbey
Hi!

Glad to hear you're interested in contributing back to the community.
Adding snappy to the C++ library sounds great.

Here's a link to the full guide on contributing:

https://cwiki.apache.org/confluence/display/AVRO/How+To+Contribute

It's pretty thorough, so here's a short version:

1) create an issue in our JIRA tracker:

http://issues.apache.org/jira/browse/AVRO

It should be listed as either 'improvement' or 'new feature.'

2) create a patch

You can do this off of subversion, as you mention, and attach to the jira
you made. Alternatively, if it's easier for you we also accept pull
requests against the github mirror of our project:

https://github.com/apache/avro

I look forward to helping review your contribution. Please let us know if
you have any other questions.

-- 
Sean
On Oct 10, 2015 12:04 PM,  wrote:

> Hi,
>
> I've done a bit of work that I'm using on a program where I have added
> Snappy compression support to the C++ DataFile class.  I also added a unit
> test to test the changes.
>
> I would like to make this available to the Avro community, but I don't
> know how to do that.  Also - I'm not familiar with the CMake process for
> adding Snappy as a dependency, so that part would probably need to be
> redone.
>
> Is this as easy as running some kind of patch script to create something
> from Subversion?
>
> Thanks,
> -J. Langley
>
>
>


[jira] [Commented] (AVRO-1733) Update LICENSE and NOTICE files included in C# release artifacts

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940806#comment-14940806
 ] 

Sean Busbey commented on AVRO-1733:
---

+1

fine to fix on commit:

{code}
diff --git a/lang/csharp/NOTICE b/lang/csharp/NOTICE
new file mode 100644
index 0000000..a396d4f
--- /dev/null
+++ b/lang/csharp/NOTICE
@@ -0,0 +1,6 @@
+Apache Avro
+Copyright 2010 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
{code}

should be a range of dates. The first release to include C# was in 2011, which 
is also the first year the code was accepted into the repo.

> Update LICENSE and NOTICE files included in C# release artifacts
> 
>
> Key: AVRO-1733
> URL: https://issues.apache.org/jira/browse/AVRO-1733
> Project: Avro
>  Issue Type: Sub-task
>  Components: csharp
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1733.1.patch
>
>






[jira] [Commented] (AVRO-1732) Update LICENSE and NOTICE files included in C++ release artifacts

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940802#comment-14940802
 ] 

Sean Busbey commented on AVRO-1732:
---

+1

fine to fix below on commit:

{code}
diff --git a/lang/c++/NOTICE b/lang/c++/NOTICE
new file mode 100644
index 0000000..a396d4f
--- /dev/null
+++ b/lang/c++/NOTICE
@@ -0,0 +1,6 @@
+Apache Avro
+Copyright 2010 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
{code}

should be a range based on contributions: 2010-2014 (or I guess 2015, based on 
either the date of assembling or the date of this patch?)

> Update LICENSE and NOTICE files included in C++ release artifacts
> -
>
> Key: AVRO-1732
> URL: https://issues.apache.org/jira/browse/AVRO-1732
> Project: Avro
>  Issue Type: Sub-task
>  Components: c++
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1732.1.patch
>
>






[jira] [Commented] (AVRO-1729) Update LICENSE and NOTICE files included in ruby gem

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940798#comment-14940798
 ] 

Sean Busbey commented on AVRO-1729:
---

+1, fine to fix the below on commit:

{code}
diff --git a/lang/ruby/NOTICE b/lang/ruby/NOTICE
new file mode 100644
index 0000000..a396d4f
--- /dev/null
+++ b/lang/ruby/NOTICE
@@ -0,0 +1,6 @@
+Apache Avro
+Copyright 2010 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
{code}

should be a range of dates for when contributions were made, 2010 - 2015.

> Update LICENSE and NOTICE files included in ruby gem
> 
>
> Key: AVRO-1729
> URL: https://issues.apache.org/jira/browse/AVRO-1729
> Project: Avro
>  Issue Type: Sub-task
>  Components: ruby
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1729.1.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939740#comment-14939740
 ] 

Sean Busbey commented on AVRO-1728:
---

Looks like there are a couple of entries near the end of license for the tools 
jar that don't include a pointer to text in the distro: MIT and CDDL.


I thought CDDL required an entry in NOTICE? Maybe the same for MPL?

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939733#comment-14939733
 ] 

Sean Busbey commented on AVRO-1728:
---

{code}
---
+License for Apache Directory, included in this binary artifact:
+
+Copyright: 2003-2015 The Apache Software Foundation
+License: http://www.apache.org/licenses/LICENSE-2.0 (see above)
+
+Commons Math includes other works under licenses compatible with the
+Apache Software License. For more information, see:
+* https://github.com/apache/directory-server/blob/trunk/NOTICE
+
+-
{code}

Should be "Apache Directory includes other works" rather than apache commons 
math.

Same comment as above about whether or not we include the works as well.

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939731#comment-14939731
 ] 

Sean Busbey commented on AVRO-1728:
---

{code}
+HttpClient contains the following data under the terms of the MPL:
+| This project includes Public Suffix List copied from
+| <https://publicsuffix.org/list/effective_tld_names.dat>
+| licensed under the terms of the Mozilla Public License, v. 2.0
+|
+| Full license text: <http://mozilla.org/MPL/2.0/>
+
{code}

We should give a pointer here to where the full text is in the distribution. 
Ala "see above", etc

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939727#comment-14939727
 ] 

Sean Busbey commented on AVRO-1728:
---

{code}+Commons Math includes other works under licenses compatible with the
+Apache Software License. For more information, see:
+* https://github.com/apache/commons-math/blob/master/LICENSE.txt
+
{code}

Are we not including the other works here? If we are, we should have this 
portion of the LICENSE copied, right?

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch
>
>






[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939725#comment-14939725
 ] 

Sean Busbey commented on AVRO-1728:
---

{code}diff --git a/lang/java/pom.xml b/lang/java/pom.xml
index a1eea63..30d4562 100644
--- a/lang/java/pom.xml
+++ b/lang/java/pom.xml
@@ -69,6 +69,7 @@
 1.3
 3.1
 2.7
+1.3.9-1
 
 
 2.12.1
{code}

How is this related?

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch
>
>






[jira] [Commented] (AVRO-1727) Update LICENSE.txt and NOTICE.txt for source distribution

2015-10-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939720#comment-14939720
 ] 

Sean Busbey commented on AVRO-1727:
---

{code}lang/java/mapred/src/test/java/org/apache/avro/hadoop/io/TestAvroSerialization.java
@@ -1,9 +1,10 @@
 /**
- * Licensed to Odiago, Inc. under one or more contributor license
- * agreements.  See the NOTICE file 
{code}

Do we know why this had the wrong name in the granted to section?

Presuming that looking at when the above came in makes it obvious that there 
was a grant to the foundation, +1

> Update LICENSE.txt and NOTICE.txt for source distribution
> -
>
> Key: AVRO-1727
> URL: https://issues.apache.org/jira/browse/AVRO-1727
> Project: Avro
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
> Attachments: AVRO-1722.1.patch
>
>






[jira] [Commented] (AVRO-1738) add java tool for outputting schema fingerprints

2015-09-16 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790670#comment-14790670
 ] 

Sean Busbey commented on AVRO-1738:
---

Normally notification of other apache projects wouldn't go in LICENSE (by my 
read of [the dev guidance for assembling 
LICENSE|http://www.apache.org/dev/licensing-howto.html#alv2-dep]). Those 
projects would only show up from the NOTICE carry-forward (which is the other 
bit that NOTICE is used for). LEGAL-118 certainly simplifies this tremendously; 
I'll update the patch to leave out the LICENSE/NOTICE change entirely.

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





Re: jira admin access

2015-09-16 Thread Sean Busbey
Thanks!

-- 
Sean
Sean,

I just added Niels and you to both the Committer and Administrator
lists in Avro Jira.  I also added Tom to the Administrator list, as
that appears never to have been done.  (In Avro we've traditionally
added all committers to the Administrator list.)

Doug

On Fri, Aug 28, 2015 at 10:26 AM, Sean Busbey  wrote:
> can whomever has admin access to the avro jira tracker add me to the admin
> list?
>
> --
> Sean


[jira] [Commented] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-13 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742751#comment-14742751
 ] 

Sean Busbey commented on AVRO-1694:
---

I filed AVRO-1740 as the follow-on to add the other major fingerprint type.

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Created] (AVRO-1740) Add CRC-64-AVRO fingerprint to Ruby implementation

2015-09-13 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1740:
-

 Summary: Add CRC-64-AVRO fingerprint to Ruby implementation
 Key: AVRO-1740
 URL: https://issues.apache.org/jira/browse/AVRO-1740
 Project: Avro
  Issue Type: Improvement
  Components: ruby
Reporter: Sean Busbey


AVRO-1694 added normalization and MD5 / SHA256, but left out CRC-64-AVRO.
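
For reference, a minimal sketch of the missing fingerprint, written in Python
rather than Ruby and following the algorithm given in the "Schema
Fingerprints" section of the Avro specification. The names fingerprint64,
FP_TABLE and EMPTY mirror the spec's pseudo-code; everything else is
illustrative, and this is not the code proposed for the Ruby library.

```python
# CRC-64-AVRO (Rabin) fingerprint, sketched from the Avro specification.
# All arithmetic is on unsigned 64-bit values held in Python ints.
MASK64 = 0xFFFFFFFFFFFFFFFF
EMPTY = 0xC15D213AA4D7A795  # fingerprint of the empty byte string


def _build_table():
    """Precompute the 256-entry lookup table used by fingerprint64."""
    table = []
    for i in range(256):
        fp = i
        for _ in range(8):
            # Shift right one bit; XOR in EMPTY if the bit shifted out was set.
            fp = (fp >> 1) ^ (EMPTY if fp & 1 else 0)
        table.append(fp)
    return table


FP_TABLE = _build_table()


def fingerprint64(data: bytes) -> int:
    """Return the 64-bit fingerprint of data as an unsigned integer."""
    fp = EMPTY
    for byte in data:
        fp = (fp >> 8) ^ FP_TABLE[(fp ^ byte) & 0xFF]
    return fp & MASK64
```

The input is expected to be the UTF-8 bytes of a schema's Parsing Canonical
Form, for example fingerprint64(b'"int"'). How the resulting 64-bit value is
rendered (signed decimal, hex, byte order) is a separate choice left to the
caller.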





[jira] [Updated] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-13 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1694:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

pushed to 1.7+. Thanks Daniel!

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Created] (AVRO-1739) update source repo for TLP URL

2015-09-13 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1739:
-

 Summary: update source repo for TLP URL
 Key: AVRO-1739
 URL: https://issues.apache.org/jira/browse/AVRO-1739
 Project: Avro
  Issue Type: Bug
  Components: c, c++, doc, ruby
Reporter: Sean Busbey
Priority: Minor


we have several places where we still refer to "the avro website" as 
http://hadoop.apache.org/avro

{code}
avro busbey$ grep -rl "http://hadoop.apache.org/avro/" *
doc/src/content/xdocs/site.xml
doc/src/content/xdocs/tabs.xml
lang/c/AUTHORS
lang/c/docs/index.txt
lang/c/NEWS
lang/c++/AUTHORS
lang/c++/MainPage.dox
lang/c++/NEWS
lang/ruby/Rakefile
{code}
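
The same search can be scripted so it is repeatable after the references are
updated. The following is a rough Python sketch, not part of the Avro build;
the function name files_mentioning and the default needle are assumptions made
for illustration.

```python
import os

# The stale URL the grep above looks for.
OLD_URL = "http://hadoop.apache.org/avro/"


def files_mentioning(root, needle=OLD_URL):
    """Return sorted paths (relative to root) of files that contain needle."""
    needle_bytes = needle.encode("utf-8")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read as bytes so binary files do not raise decoding errors.
                with open(path, "rb") as handle:
                    if needle_bytes in handle.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file; skip it and keep scanning
    return sorted(hits)
```

Calling files_mentioning(".") from the top of a checkout should list the same
kind of paths as the grep output above; an empty list would indicate no
remaining references.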





[jira] [Commented] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742140#comment-14742140
 ] 

Sean Busbey commented on AVRO-1694:
---

copy of my comment from PR-40:

{quote}
I used the proposed tool in AVRO-1738 and IRB to verify the given MD5 and 
SHA-256 for the int schema and things look fine.

{code}
busbey2-MBA:avro-help busbey$ echo '"int"' | java -jar avro-tools-1.9.0-SNAPSHOT.jar fingerprint  -
8f5c393f1ad57572 -
busbey2-MBA:avro-help busbey$ echo '{"type":"int"}' | java -jar avro-tools-1.9.0-SNAPSHOT.jar fingerprint  -
8f5c393f1ad57572 -
busbey2-MBA:avro-help busbey$ echo '{"type":"int"}' | java -jar avro-tools-1.9.0-SNAPSHOT.jar fingerprint --fingerprint MD5 -
ef524ea1b91e73173d938ade36c1db32 -
busbey2-MBA:avro-help busbey$ echo '"int"' | java -jar avro-tools-1.9.0-SNAPSHOT.jar fingerprint --fingerprint MD5  -
ef524ea1b91e73173d938ade36c1db32 -
busbey2-MBA:avro-help busbey$ echo '"int"' | java -jar avro-tools-1.9.0-SNAPSHOT.jar fingerprint --fingerprint SHA-256  -
3f2b87a9fe7cc9b13835598c3981cd45e3e355309e5090aa0933d7becb6fba45 -
busbey2-MBA:avro-help busbey$ irb
jruby-1.7.3 :001 > 'ef524ea1b91e73173d938ade36c1db32'.to_i(16)
 => 318112854175969537208795771590915775282 
jruby-1.7.3 :002 > '3f2b87a9fe7cc9b13835598c3981cd45e3e355309e5090aa0933d7becb6fba45'.to_i(16)
 => 28572620203319713300323544804233350633246234624932075150020181448463213378117 
{code}

I'd like some additional test schemas, but I'm fine with those as a follow-on.

As an aside, from looking at the shared test data for schema normalization it's 
going to be awkward to use as-is since it relies on converting to signed-longs 
instead of a hex string.

I'll push this in a bit unless someone else has concerns.
{quote}
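
For anyone who wants to repeat the check above without avro-tools or IRB, here is a minimal Java sketch. It assumes the fingerprint is simply the digest of the UTF-8 bytes of the schema's canonical form, and that the canonical form of the int schema is the string "int" including the quotes. The class and method names are hypothetical and are not part of Avro or of the patch.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Avro or the patch under review.
class DigestFingerprintSketch {

  /** Hex digest of the UTF-8 bytes of a schema's canonical form. */
  static String hexFingerprint(String algorithm, String canonicalForm) {
    try {
      byte[] digest = MessageDigest.getInstance(algorithm)
          .digest(canonicalForm.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("digest algorithm not available: " + algorithm, e);
    }
  }

  /** Same conversion as Ruby's String#to_i(16): hex digits to a non-negative integer. */
  static BigInteger asInteger(String hex) {
    return new BigInteger(hex, 16);
  }

  public static void main(String[] args) {
    String canonicalInt = "\"int\""; // assumed canonical form of the int schema
    for (String algorithm : new String[] {"MD5", "SHA-256"}) {
      String hex = hexFingerprint(algorithm, canonicalInt);
      System.out.println(algorithm + " " + hex + " " + asInteger(hex));
    }
  }
}
```

The integer form is what the Ruby side compares against, which is why the hex string is converted with BigInteger rather than parsed as a long.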

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Updated] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1694:
--
Assignee: Daniel Schierbeck

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Updated] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1694:
--
Status: Patch Available  (was: Open)

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Updated] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1694:
--
Fix Version/s: 1.9.0
   1.8.0
   1.7.8

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Updated] (AVRO-1694) Support for schema fingerprints in the Ruby library

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1694:
--
Attachment: AVRO-1694.1.patch

Attaching Daniel's current patch from PR-40.

> Support for schema fingerprints in the Ruby library
> ---
>
> Key: AVRO-1694
> URL: https://issues.apache.org/jira/browse/AVRO-1694
> Project: Avro
>  Issue Type: Wish
>  Components: ruby
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.7.8, 1.8.0, 1.9.0
>
> Attachments: AVRO-1694.1.patch
>
>
> There does not seem to be any support for generating schema fingerprints in 
> the Ruby library. In order to avoid inlining schemas in my Avro-encoded 
> messages I'd like to store them separately and instead write the fingerprint 
> in the Avro metadata, thus allowing a reader to fetch and cache the actual 
> schema from the schema registry.
> In order for that to work, my Ruby writer needs to be able to actually 
> generate a fingerprint for a schema.
> Is the Ruby library being actively maintained? I would be willing to work on 
> this myself if someone would review and merge the work.





[jira] [Updated] (AVRO-1738) add java tool for outputting schema fingerprints

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1738:
--
Status: Patch Available  (was: In Progress)

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>    Reporter: Sean Busbey
>    Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





[jira] [Updated] (AVRO-1738) add java tool for outputting schema fingerprints

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1738:
--
Attachment: AVRO-1738.1.patch

-01

* add 'fingerprint' to tool
* 'fingerprint' takes file or stdin, parses schema, passes to 
SchemaNormalization
* default fingerprint to CRC-64-AVRO
* output as hex string, based on code copied from commons-codec
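
As a rough illustration of the default case, here is a standalone Java sketch of CRC-64-AVRO as described in the Avro specification, printed as a hex string. It is not the code in the patch: the class and method names are hypothetical, and the low-byte-first output order is an assumption chosen to match the 8f5c393f1ad57572 reported for the int schema in AVRO-1694.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; the real tool delegates to SchemaNormalization.
class Crc64AvroSketch {
  // Fingerprint of the empty input, also used as the polynomial (per the Avro spec).
  static final long EMPTY = 0xc15d213aa4d7a795L;
  static final long[] TABLE = new long[256];

  static {
    for (int i = 0; i < 256; i++) {
      long fp = i;
      for (int j = 0; j < 8; j++) {
        fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
      }
      TABLE[i] = fp;
    }
  }

  /** 64-bit Rabin fingerprint of the given bytes. */
  static long fingerprint64(byte[] data) {
    long fp = EMPTY;
    for (byte b : data) {
      fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
    }
    return fp;
  }

  /** Hex string with the least significant byte first (assumed output order). */
  static String toHexLittleEndian(long fp) {
    StringBuilder hex = new StringBuilder(16);
    for (int i = 0; i < 8; i++) {
      hex.append(String.format("%02x", fp & 0xff));
      fp >>>= 8;
    }
    return hex.toString();
  }

  public static void main(String[] args) {
    byte[] canonicalInt = "\"int\"".getBytes(StandardCharsets.UTF_8);
    System.out.println(toHexLittleEndian(fingerprint64(canonicalInt)));
  }
}
```

The input here is the canonical form of the schema, so a real tool would still need to parse and normalize the schema text before fingerprinting it.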

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>    Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





[jira] [Work started] (AVRO-1738) add java tool for outputting schema fingerprints

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AVRO-1738 started by Sean Busbey.
-
> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>    Reporter: Sean Busbey
>    Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
>
> over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.





[jira] [Created] (AVRO-1738) add java tool for outputting schema fingerprints

2015-09-12 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1738:
-

 Summary: add java tool for outputting schema fingerprints
 Key: AVRO-1738
 URL: https://issues.apache.org/jira/browse/AVRO-1738
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 1.7.8, 1.8.0


over in AVRO-1694 I wanted to quickly check that the Java library came up with 
the same md5/sha fingerprint for some schemas that the proposed Ruby 
implementation does.

I noticed we don't have a tool that exposes the functionality yet, which seems 
like a commonly useful thing to do.





[jira] [Resolved] (AVRO-1709) [Ruby] Ignore generated files

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved AVRO-1709.
---
   Resolution: Fixed
Fix Version/s: 1.8.0
   1.7.8
   1.9.0

I pushed this to trunk, branch-1.8, and branch-1.7. Thanks Daniel!

> [Ruby] Ignore generated files
> -
>
> Key: AVRO-1709
> URL: https://issues.apache.org/jira/browse/AVRO-1709
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Fix For: 1.9.0, 1.7.8, 1.8.0
>
> Attachments: AVRO-1709.1.patch
>
>
> Neither data.avr nor Gemfile.lock should be committed to the repository – 
> adding them to .gitignore thus makes it easier to work on the Ruby library.
> PR: https://github.com/apache/avro/pull/45   





[jira] [Updated] (AVRO-1709) [Ruby] Ignore generated files

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1709:
--
Attachment: AVRO-1709.1.patch

Attaching Daniel's patch from the PR.

> [Ruby] Ignore generated files
> -
>
> Key: AVRO-1709
> URL: https://issues.apache.org/jira/browse/AVRO-1709
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
> Attachments: AVRO-1709.1.patch
>
>
> Neither data.avr nor Gemfile.lock should be committed to the repository – 
> adding them to .gitignore thus makes it easier to work on the Ruby library.
> PR: https://github.com/apache/avro/pull/45   





[jira] [Updated] (AVRO-1709) [Ruby] Ignore generated files

2015-09-12 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1709:
--
Assignee: Daniel Schierbeck

> [Ruby] Ignore generated files
> -
>
> Key: AVRO-1709
> URL: https://issues.apache.org/jira/browse/AVRO-1709
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Daniel Schierbeck
>
> Neither data.avr nor Gemfile.lock should be committed to the repository – 
> adding them to .gitignore thus makes it easier to work on the Ruby library.
> PR: https://github.com/apache/avro/pull/45   





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738112#comment-14738112
 ] 

Sean Busbey commented on AVRO-1722:
---

the binary license changes included adding {{src/main/resources}} as a 
resources dir for avro-tools, but didn't actually add that directory. missing 
new files?

it also excluded LICENSE and NOTICE files from dependencies. I'm used to doing 
the LICENSE file call outs manually (or via velocity). In the case of NOTICE 
files, I'm used to using the shade plugin's 
[Apache Notice Resource Transformer|http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ApacheNoticeResourceTransformer] 
to aggregate NOTICE files. An alternative option is to use the maven 
dependency plugin to unpack all the NOTICE files and then concat them.

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738106#comment-14738106
 ] 

Sean Busbey commented on AVRO-1722:
---

{code}
lang/java/tools/src/test/compiler/output-string/avro/examples/baseball/Player.java
 @@ -2,6 +2,22 @@
  * Autogenerated by Avro
  * 
  * DO NOT EDIT DIRECTLY
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
  */
{code}

This is a test output of the compiler and I don't think the compiler actually 
includes the ASF header in the compiler output for this input (or the other 
examples), right?

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738102#comment-14738102
 ] 

Sean Busbey commented on AVRO-1722:
---

also, we have a few references to "the MIT license" but we don't have a copy of 
the text included yet.

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738100#comment-14738100
 ] 

Sean Busbey commented on AVRO-1722:
---

{code}

+--
+License for jQuery v1.4.2 used by the Java IPC implementation:
+
+Copyright 2010, John Resig
+Dual licensed under the MIT or GPL Version 2 licenses.
+http://jquery.org/license
+
+Includes Sizzle.js
+http://sizzlejs.com/
+Copyright 2010, The Dojo Foundation
+Released under the MIT, BSD, and GPL Licenses.
{code}

we have to say which of the available licenses we're distributing under. 
(presumably MIT for both)

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738097#comment-14738097
 ] 

Sean Busbey commented on AVRO-1722:
---

in the source license, could we collapse all the libtool inclusions to a single 
copy of the "This file is a part of GNU Libtool, etc" block by listing the 
files, their copyright dates, and a note that they all are included under ASLv2 
per a single copy of the text?

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738084#comment-14738084
 ] 

Sean Busbey commented on AVRO-1722:
---

how about sub tasks per language (and the source tarball) that we can close out 
as we make progress?

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738075#comment-14738075
 ] 

Sean Busbey commented on AVRO-1642:
---

{code}
+  /*
+   * Returns the number of parameter units required by fields for the AllArgsConstructor
+   */
+  protected int calcAllArgConstructorParameterUnits(Schema record) {
{code}

Add a javadoc that the schema needs to be a Record, add an input validation 
that the schema is a Record, add a failure test for same.

{code}
+  switch(f.schema().getType()){
+case DOUBLE:
+case FLOAT:
+  parameterUnits += 2; // double & float types contribute 2 parameter units
+  break;
+default:
+  parameterUnits += 1; // all other types contribute 1 parameter unit
+  }
{code}

Types that need a java long or double count as 2, everything else is 1. 
Important to note that it's only the primitive types long and double.

Per [generic java api docs|http://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/generic/package-summary.html], 
Avro is going to generate Double and Long objects instead, which should count 
as 1 parameter unit.

My apologies for pointing you towards the JVM note earlier. For some reason at 
the time I thought specific would use primitives.
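
To make the counting rule concrete, here is a small Java sketch. It works on java.lang.Class values rather than Avro schemas, so it only illustrates the JVM rule (primitive long and double take 2 units, everything else, including the boxed Long and Double, takes 1). The class and method names are hypothetical and not from the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the JVM parameter-unit rule; not code from the patch.
class ParameterUnitsSketch {

  /** Primitive long and double occupy 2 units; every other type occupies 1. */
  static int unitsFor(Class<?> parameterType) {
    return (parameterType == long.class || parameterType == double.class) ? 2 : 1;
  }

  /** Sum of units over a parameter list (excluding the implicit `this`). */
  static int totalUnits(List<Class<?>> parameterTypes) {
    int total = 0;
    for (Class<?> type : parameterTypes) {
      total += unitsFor(type);
    }
    return total;
  }

  public static void main(String[] args) {
    List<Class<?>> params = Arrays.<Class<?>>asList(long.class, Double.class, int.class);
    System.out.println(totalUnits(params)); // long=2, Double=1, int=1
  }
}
```

Note that float is a single unit under this rule, which is where the patch's FLOAT case goes wrong.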

{code}
+  /*
+   * @throws RuntimeException generated code for specified record will be not compile.
+   */
+  protected void validateRecordForCompilation(Schema record) {
+this.createAllArgsConstructor =
+calcAllArgConstructorParameterUnits(record) <= MAX_FIELD_PARAMETER_UNIT_COUNT;
+  }
{code}

please remove the javadoc throws, since this does not throw. Also please log a 
warning when createAllArgsConstructor will be false.

{code}
+#if ($this.isCreateAllArgsConstructor())
 
   /**
* All-args constructor.
@@ -80,6 +81,7 @@ public class ${this.mangle($schema.getName())}#if ($schema.isError()) extends or
 #end
   }
 #end
+#end
{code}

When we're not going to create the all-args constructor, please output a java 
comment that explains why we aren't making it, a pointer to using the builder 
pattern, etc.

{quote}
rename of lang/java/compiler/src/test/java/org/apache/avro/compiler/TestSpecificCompiler.java
to lang/java/compiler/src/test/java/org/apache/avro/compiler/specific/TestSpecificCompiler.java
{quote}

please don't include the rename in this patch, since it makes it hard to see 
what changes are relevant to this patch (we should do this rename in a 
different JIRA, we can do it first if it makes this patch easier).

{code}
+  @Test
+  public void testMaxParameterCounts() throws Exception {
+Schema validSchema1 = createSampleRecordSchema(SpecificCompiler.MAX_FIELD_PARAMETER_UNIT_COUNT, 0);
+new SpecificCompiler(validSchema1).compile();
+Schema validSchema2 = createSampleRecordSchema(SpecificCompiler.MAX_FIELD_PARAMETER_UNIT_COUNT - 2, 1);
+new SpecificCompiler(validSchema2).compile();
+Schema validSchema3 = createSampleRecordSchema(SpecificCompiler.MAX_FIELD_PARAMETER_UNIT_COUNT - 1, 1);
+new SpecificCompiler(validSchema3).compile();
+Schema validSchema4 = createSampleRecordSchema(SpecificCompiler.MAX_FIELD_PARAMETER_UNIT_COUNT + 1, 0);
+new SpecificCompiler(validSchema4).compile();
+  }
{code}

Include comments of what you are attempting to test in these calls, 
expectations of success/failure, etc.

{code}
-  
+
   /** Uses the system's java compiler to actually compile the generated code. */
-  static void assertCompilesWithJavaCompiler(Collection outputs) 
+  static void assertCompilesWithJavaCompiler(Collection outputs)
   throws IOException {
 if (outputs.isEmpty()) {
   return;   // Nothing to compile!
@@ -730,13 +757,13 @@ public class TestSpecificCompiler {
 }
 
 JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
-StandardJavaFileManager fileManager = 
-  compiler.getStandardFileManager(null, null, null);
-
-CompilationTask cTask = compiler.getTask(null, fileManager, null, null, 
-null,
-fileManager.getJavaFileObjects(
+StandardJavaFileManager fileManager =
+compiler.getStandardFileManager(null, null, null);
+
+CompilationTask cTask = compiler.getTask(null, fileManager, null, null,
+null, fileManager.getJavaFileObjects(
 javaFiles.toArray(new File[javaFiles.size()])));
-assertTrue(cTask.call());
+boolean compilesWithoutError = cTask.call();
+assertTrue(compilesWithoutError);
{code}

Please leave out unrelated code reformatting.

{quote}
lang/java/tools/src/test/compiler/output-no-constructor/*
{quote}

Where are these coming from / used?

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apa

[jira] [Commented] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738044#comment-14738044
 ] 

Sean Busbey commented on AVRO-1642:
---

{code}
   public static void compileSchema(File src, File dest) throws IOException {
-compileSchema(new File[] {src}, dest);
+compileSchema(new File[]{src}, dest);
{code}

please don't make unrelated formatting fixes.

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apache.org/jira/browse/AVRO-1642
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Windows/Linux all Java
>Reporter: Bryce Alcock
>Assignee: Prateek Rungta
>Priority: Critical
>  Labels: build, maven, specific
> Attachments: AVRO-1642-0.patch, AVRO-1642-1.patch, avro-1642-fail.tar
>
>
> The JVM Spec indicates that:
> {quote}The number of method parameters is limited to 255 by the definition of 
> a method descriptor (§4.3.3), where the limit includes one unit for this in 
> the case of instance or interface method invocations. Note that a method 
> descriptor is defined in terms of a notion of method parameter length in 
> which a parameter of type long or double contributes two units to the length, 
> so parameters of these types further reduce the limit. {quote}
> Avro Generated Java code with say more than 255 fields will create a 
> constructor that is not valid and won't compile.
> Simple test is to create a 256 field avro schema, use the avro-maven auto 
> code gen plugin, and try to compile the resulting class.
> DON'T use linux when doing this use windows, my suspicion is that Linux JavaC 
> generates invalid byte code but does not complain.
> Windows will correctly complain indicating that you are a violator of the JVM 
> specification.





[jira] [Commented] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-09-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738043#comment-14738043
 ] 

Sean Busbey commented on AVRO-1642:
---

{code}
+  private static final int JVM_METHOD_ARG_LIMIT = 255;
+  public static final int MAX_FIELD_PARAMETER_UNIT_COUNT = JVM_METHOD_ARG_LIMIT - 1;
+
{code}

Can you make this package protected instead of public and add a note that it's 
only visible for testing?
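
Something along these lines is what I have in mind; this is a sketch only. The enclosing class name and the fitsAllArgsConstructor helper are hypothetical stand-ins and not names from the patch.

```java
// Hypothetical stand-in for the relevant part of the compiler class.
class AllArgsConstructorLimitSketch {
  private static final int JVM_METHOD_ARG_LIMIT = 255;

  // Package-private rather than public: visible for testing only.
  // One unit is reserved for the implicit `this` reference.
  static final int MAX_FIELD_PARAMETER_UNIT_COUNT = JVM_METHOD_ARG_LIMIT - 1;

  /** True when a constructor needing this many parameter units stays within the limit. */
  static boolean fitsAllArgsConstructor(int parameterUnits) {
    return parameterUnits <= MAX_FIELD_PARAMETER_UNIT_COUNT;
  }
}
```

Tests in the same package can still read the constant, while users of the compiler no longer see it as public API.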

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apache.org/jira/browse/AVRO-1642
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Windows/Linux all Java
>Reporter: Bryce Alcock
>Assignee: Prateek Rungta
>Priority: Critical
>  Labels: build, maven, specific
> Attachments: AVRO-1642-0.patch, AVRO-1642-1.patch, avro-1642-fail.tar
>
>
> The JVM Spec indicates that:
> {quote}The number of method parameters is limited to 255 by the definition of 
> a method descriptor (§4.3.3), where the limit includes one unit for this in 
> the case of instance or interface method invocations. Note that a method 
> descriptor is defined in terms of a notion of method parameter length in 
> which a parameter of type long or double contributes two units to the length, 
> so parameters of these types further reduce the limit. {quote}
> Avro Generated Java code with say more than 255 fields will create a 
> constructor that is not valid and won't compile.
> Simple test is to create a 256 field avro schema, use the avro-maven auto 
> code gen plugin, and try to compile the resulting class.
> DON'T use linux when doing this use windows, my suspicion is that Linux JavaC 
> generates invalid byte code but does not complain.
> Windows will correctly complain indicating that you are a violator of the JVM 
> specification.





[jira] [Commented] (AVRO-1726) Add support for appending a variable number of blocks to DataFileWriter

2015-09-04 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730989#comment-14730989
 ] 

Sean Busbey commented on AVRO-1726:
---

{code}
+  /**
+   * Appends blocks from another file.  otherFile must have the same schema.
* Data blocks will be copied without de-serializing data.  If the codecs
* of the two files are compatible, data blocks are copied directly without
* decompression.  If the codecs are not compatible, blocks from otherFile
@@ -354,11 +365,13 @@
* deflate at compression level 7.  If recompress is false, blocks
* will be copied without changing the compression level.  If true, they will
* be converted to the new compression level.
-   * @param otherFile
-   * @param recompress
+   * @param otherFile the file to append from
+   * @param recompress whether or not to recompress blocks as they are appended
+   * @param numBlocks the number of blocks to append
+   * @return true if otherFile has more blocks available, false otherwise
* @throws IOException
*/
-  public void appendAllFrom(DataFileStream otherFile, boolean recompress) throws IOException {
+  public boolean appendBlocksFrom(DataFileStream otherFile, boolean recompress, long numBlocks) throws IOException {
{code}

Is it worth adding a block-number-offset so that the method can be used to 
handle arbitrary "pull a contiguous sub-set of blocks"?
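
To sketch what I mean by an offset: the following uses a plain Iterator and List as stand-ins for the source file's blocks and the writer, so it shows only the skip-then-copy shape and none of the codec handling. The class name, method name, and offset parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; Iterator and List stand in for DataFileStream and DataFileWriter.
class BlockRangeSketch {

  /**
   * Skips `offset` blocks, then appends up to `numBlocks` blocks to the sink.
   * Returns true if the source still has blocks available afterwards.
   */
  static <T> boolean appendBlockRange(Iterator<T> source, List<T> sink,
                                      long offset, long numBlocks) {
    for (long skipped = 0; skipped < offset && source.hasNext(); skipped++) {
      source.next(); // discard blocks before the requested range
    }
    for (long copied = 0; copied < numBlocks && source.hasNext(); copied++) {
      sink.add(source.next());
    }
    return source.hasNext();
  }

  public static void main(String[] args) {
    List<String> sink = new ArrayList<>();
    Iterator<String> blocks = Arrays.asList("b0", "b1", "b2", "b3", "b4").iterator();
    boolean more = appendBlockRange(blocks, sink, 1, 2);
    System.out.println(sink + " more=" + more);
  }
}
```

Keeping the same boolean return value means a caller could still loop over a file in fixed-size chunks by passing an offset of 0 each time.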

> Add support for appending a variable number of blocks to DataFileWriter
> ---
>
> Key: AVRO-1726
> URL: https://issues.apache.org/jira/browse/AVRO-1726
> Project: Avro
>  Issue Type: Improvement
>Affects Versions: 1.7.7
>Reporter: Bryan Bende
>Priority: Minor
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1726.patch
>
>
> It would be helpful to have the ability to append a variable number of raw 
> blocks from a DataFileReader to a DataFileWriter, similar to appendAllFrom() 
> but specifying how many blocks to append.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-02 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727434#comment-14727434
 ] 

Sean Busbey commented on AVRO-1722:
---

{code}
+share/rat-excludes.txt
{code}

with the move from ant to the maven plugin, we don't need this file or the 
associated exclusion now, right?

{code}

+CHANGES.txt
+DIST_README.txt
+lang/perl/Changes
+lang/c/README.maintaining_win32.txt
+lang/c/docs/index.txt
+lang/csharp/README
+lang/java/archetypes/avro-service-archetype/src/test/integration/projects/basic/archetype.properties
{code}

is there a reason we can't convert these to markdown and add a commented-out 
license header?

{code}
+
+
+**/.git/**
+**/.gitignore
+
+**/target/**
{code}

this ought to be getting excluded already via the plugin's default exclude 
filter. is it currently failing?

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Commented] (AVRO-1722) Fix licensing issues

2015-09-02 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727421#comment-14727421
 ] 

Sean Busbey commented on AVRO-1722:
---

Have you done a grep already for "[Cc]opied from" ?
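
For anyone who wants the same check inside a Java-based build step rather than on the command line, here is a rough sketch. `CopiedFromScan` is a hypothetical name; the only thing taken from the comment above is the pattern itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: flags lines that mention "Copied from" or "copied from". */
final class CopiedFromScan {
  private static final Pattern COPIED_FROM = Pattern.compile("[Cc]opied from");

  /** Returns "lineNumber: text" (1-based) for every line containing the pattern. */
  static List<String> matches(List<String> lines) {
    List<String> hits = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      String line = lines.get(i);
      if (COPIED_FROM.matcher(line).find()) {
        hits.add((i + 1) + ": " + line);
      }
    }
    return hits;
  }
}
```

The lines of a file could be supplied with `java.nio.file.Files.readAllLines`; walking the source tree is left out to keep the sketch short.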

> Fix licensing issues
> 
>
> Key: AVRO-1722
> URL: https://issues.apache.org/jira/browse/AVRO-1722
> Project: Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
>Priority: Blocker
> Fix For: 1.8.0
>
>
> The 1.8.0 release vote turned up a lot of licensing issues that need to be 
> fixed. We need to do a scrub of the licensing.





[jira] [Updated] (AVRO-1720) Add an avro-tool to count records in an avro file

2015-09-02 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1720:
--
Status: Patch Available  (was: Open)

> Add an avro-tool to count records in an avro file
> -
>
> Key: AVRO-1720
> URL: https://issues.apache.org/jira/browse/AVRO-1720
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Janosch Woschitz
>Priority: Minor
> Attachments: AVRO-1720-with-extended-unittests.patch, AVRO-1720.patch
>
>
> If you're dealing with bigger avro files (>100MB) it would be nice to have a 
> way to quickly count the amount of records contained within that file.
> With the current state of avro-tools the only way to achieve this (to my 
> current knowledge) is to dump the data to json and count the amount of 
> records. For bigger files this might take a while due to the serialization 
> overhead and since every record needs to be looked at.
> I added a new tool which is optimized for counting records, it does not 
> serialize the records and reads only the block count for each block.
> {panel:title=Naive benchmark}
> {noformat}
> # the input file had a size of ~300MB
> $ du -sh sample.avro
> 323M sample.avro
> # using the new count tool
> $ time java -jar avro-tools.jar count sample.avro
> 331439
> real 0m4.670s
> user 0m6.167s
> sys 0m0.513s
> # the current way of counting records
> $ time java -jar avro-tools.jar tojson sample.avro | wc
> 331439 54904484 1838231743
> real 0m52.760s
> user 1m42.317s
> sys 0m3.209s
> # the overhead of wc is rather minor
> $ time java -jar avro-tools.jar tojson sample.avro > /dev/null
> real 0m47.834s
> user 0m53.317s
> sys 0m1.194s
> {noformat}
> {panel}
> This tool uses the HDFS API to handle files from any supported filesystem. I 
> added the unit tests to the already existing TestDataFileTools since it 
> provided convenient utility functions which I could reuse for my test 
> scenarios.





[jira] [Updated] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-08-28 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1642:
--
Labels: build maven specific  (was: build maven)

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apache.org/jira/browse/AVRO-1642
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Windows/Linux all Java
>Reporter: Bryce Alcock
>Priority: Critical
>  Labels: build, maven, specific
>
> The JVM Spec indicates that:
> {quote}The number of method parameters is limited to 255 by the definition of 
> a method descriptor (§4.3.3), where the limit includes one unit for this in 
> the case of instance or interface method invocations. Note that a method 
> descriptor is defined in terms of a notion of method parameter length in 
> which a parameter of type long or double contributes two units to the length, 
> so parameters of these types further reduce the limit. {quote}
> Avro Generated Java code with say more than 255 fields will create a 
> constructor that is not valid and won't compile.
> Simple test is to create a 256 field avro schema, use the avro-maven auto 
> code gen plugin, and try to compile the resulting class.
> DON'T use Linux when doing this; use Windows. My suspicion is that javac on 
> Linux generates invalid byte code but does not complain.
> Windows will correctly complain indicating that you are a violator of the JVM 
> specification.





[jira] [Commented] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-08-28 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720287#comment-14720287
 ] 

Sean Busbey commented on AVRO-1642:
---

so long as we're making correct byte code, you should also count double and 
long fields as two "parameter units"
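
To make the unit counting concrete, a small sketch follows. `ParameterUnits` is a hypothetical helper, not code from Avro or from any patch on this issue, and it assumes field types are given as Java type names; only primitive `long` and `double` count as two units, while boxed or reference types count as one.

```java
import java.util.List;

/** Hypothetical helper: counts method-descriptor parameter units (JVMS 4.3.3). */
final class ParameterUnits {
  /** Limit on parameter units, including one unit for `this`. */
  static final int MAX_UNITS = 255;

  /** Primitive long and double take two units; every other type takes one. */
  static int unitsFor(String javaType) {
    return ("long".equals(javaType) || "double".equals(javaType)) ? 2 : 1;
  }

  /** Total units for an all-args constructor, including the implicit `this`. */
  static int constructorUnits(List<String> fieldTypes) {
    int units = 1; // the implicit `this` reference
    for (String type : fieldTypes) {
      units += unitsFor(type);
    }
    return units;
  }

  /** True when an all-args constructor for these fields stays within the limit. */
  static boolean canEmitAllArgsConstructor(List<String> fieldTypes) {
    return constructorUnits(fieldTypes) <= MAX_UNITS;
  }
}
```

Under this counting a record of 254 `int` fields still fits, but a record of 128 `long` fields does not, even though it has far fewer than 255 fields.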

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apache.org/jira/browse/AVRO-1642
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Windows/Linux all Java
>Reporter: Bryce Alcock
>Priority: Critical
>  Labels: build, maven
>
> The JVM Spec indicates that:
> {quote}The number of method parameters is limited to 255 by the definition of 
> a method descriptor (§4.3.3), where the limit includes one unit for this in 
> the case of instance or interface method invocations. Note that a method 
> descriptor is defined in terms of a notion of method parameter length in 
> which a parameter of type long or double contributes two units to the length, 
> so parameters of these types further reduce the limit. {quote}
> Avro Generated Java code with say more than 255 fields will create a 
> constructor that is not valid and won't compile.
> Simple test is to create a 256 field avro schema, use the avro-maven auto 
> code gen plugin, and try to compile the resulting class.
> DON'T use Linux when doing this; use Windows. My suspicion is that javac on 
> Linux generates invalid byte code but does not complain.
> Windows will correctly complain indicating that you are a violator of the JVM 
> specification.





[jira] [Commented] (AVRO-1642) JVM Spec Violation 255 Parameter Limit Exceeded

2015-08-28 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720286#comment-14720286
 ] 

Sean Busbey commented on AVRO-1642:
---

You can feel free to start working on this now. Once I get jira permissions 
straightened out I'll assign it to you.

I think "doesn't make a constructor" is a better and easier to work with 
failure than "generates invalid bytecodes", so +1 on the approach. We should be 
sure to generate a warning of some kind when this happens though, since it will 
likely be surprising for users adding their 256th field.

Can we start with coming up with a test that fails? I imagine that even if the 
sun compiler on linux generates invalid byte code it should fail when 
attempting to load the class.

> JVM Spec Violation 255 Parameter Limit Exceeded 
> 
>
> Key: AVRO-1642
> URL: https://issues.apache.org/jira/browse/AVRO-1642
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Windows/Linux all Java
>Reporter: Bryce Alcock
>Priority: Critical
>  Labels: build, maven
>
> The JVM Spec indicates that:
> {quote}The number of method parameters is limited to 255 by the definition of 
> a method descriptor (§4.3.3), where the limit includes one unit for this in 
> the case of instance or interface method invocations. Note that a method 
> descriptor is defined in terms of a notion of method parameter length in 
> which a parameter of type long or double contributes two units to the length, 
> so parameters of these types further reduce the limit. {quote}
> Avro Generated Java code with say more than 255 fields will create a 
> constructor that is not valid and won't compile.
> Simple test is to create a 256 field avro schema, use the avro-maven auto 
> code gen plugin, and try to compile the resulting class.
> DON'T use Linux when doing this; use Windows. My suspicion is that javac on 
> Linux generates invalid byte code but does not complain.
> Windows will correctly complain indicating that you are a violator of the JVM 
> specification.





jira admin access

2015-08-28 Thread Sean Busbey
can whoever has admin access to the avro jira tracker add me to the admin
list?

-- 
Sean


[jira] [Commented] (AVRO-1720) Add an avro-tool to count records in an avro file

2015-08-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709807#comment-14709807
 ] 

Sean Busbey commented on AVRO-1720:
---

{code}
+  try {
+while (streamReader.nextBlock() != null) {
+  count += streamReader.getBlockCount();
+}
+  } catch (NoSuchElementException e) {
+// no op
+  }
{code}

what throws this?
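
If the exception comes from calling `nextBlock()` once the stream is exhausted (an assumption here, not something the patch states), then guarding the loop with `hasNext()` would avoid the empty catch block. The sketch below uses a hypothetical `BlockSource` interface as a stand-in for the stream reader; it is not the Avro API.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a block-oriented reader; not the Avro API. */
interface BlockSource {
  boolean hasNext();     // true while another block remains
  void nextBlock();      // advances to the next block
  long getBlockCount();  // number of records in the current block
}

final class BlockCounter {
  /** Sums per-block record counts without relying on an exception to stop. */
  static long countRecords(BlockSource source) {
    long count = 0;
    while (source.hasNext()) {
      source.nextBlock();
      count += source.getBlockCount();
    }
    return count;
  }

  /** In-memory BlockSource over a list of per-block counts, for illustration. */
  static BlockSource fromCounts(List<Long> counts) {
    final Iterator<Long> it = counts.iterator();
    return new BlockSource() {
      private long current;

      @Override
      public boolean hasNext() {
        return it.hasNext();
      }

      @Override
      public void nextBlock() {
        // Iterator.next() throws NoSuchElementException when exhausted.
        current = it.next();
      }

      @Override
      public long getBlockCount() {
        return current;
      }
    };
  }
}
```

The loop terminates on the `hasNext()` check, so a `NoSuchElementException` would only surface for a genuine bug rather than being swallowed.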

> Add an avro-tool to count records in an avro file
> -
>
> Key: AVRO-1720
> URL: https://issues.apache.org/jira/browse/AVRO-1720
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Janosch Woschitz
>Priority: Minor
> Attachments: AVRO-1720.patch
>
>
> If you're dealing with bigger avro files (>100MB) it would be nice to have a 
> way to quickly count the amount of records contained within that file.
> With the current state of avro-tools the only way to achieve this (to my 
> current knowledge) is to dump the data to json and count the amount of 
> records. For bigger files this might take a while due to the serialization 
> overhead and since every record needs to be looked at.
> I added a new tool which is optimized for counting records, it does not 
> serialize the records and reads only the block count for each block.
> {panel:title=Naive benchmark}
> {noformat}
> # the input file had a size of ~300MB
> $ du -sh sample.avro
> 323M sample.avro
> # using the new count tool
> $ time java -jar avro-tools.jar count sample.avro
> 331439
> real 0m4.670s
> user 0m6.167s
> sys 0m0.513s
> # the current way of counting records
> $ time java -jar avro-tools.jar tojson sample.avro | wc
> 331439 54904484 1838231743
> real 0m52.760s
> user 1m42.317s
> sys 0m3.209s
> # the overhead of wc is rather minor
> $ time java -jar avro-tools.jar tojson sample.avro > /dev/null
> real 0m47.834s
> user 0m53.317s
> sys 0m1.194s
> {noformat}
> {panel}
> This tool uses the HDFS API to handle files from any supported filesystem. I 
> added the unit tests to the already existing TestDataFileTools since it 
> provided convenient utility functions which I could reuse for my test 
> scenarios.





[jira] [Updated] (AVRO-1676) Do not treat enum symbols as immutable when deep copying

2015-08-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1676:
--
Status: Patch Available  (was: Open)

> Do not treat enum symbols as immutable when deep copying
> 
>
> Key: AVRO-1676
> URL: https://issues.apache.org/jira/browse/AVRO-1676
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Mike Rodriguez
> Attachments: AVRO-1676.1.patch
>
>
> Enum types should be supported in GenericData.deepCopy() so that we can 
> convert from (in memory) generics records to specifics records.
> Without this fix in place, it is non-trivial to attempt to create specific 
> records from generic records if they may happen to have any nested enum data 
> types.
> This issue has already been pointed out by Doug Cutting which can be 
> referenced at \[1] and \[2].
> In \[2] specifically, Doug proposed the change that would fix this issue.
> \[1] 
> https://issues.apache.org/jira/browse/AVRO-1455?focusedCommentId=13899535&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13899535
> \[2] 
> http://mail-archives.apache.org/mod_mbox/avro-user/201402.mbox/%3ccaleq1z9skctci0h8rtntyjabdm1hn8yuudtr2ft+czrvhto...@mail.gmail.com%3E
>  





[jira] [Updated] (AVRO-1676) Do not treat enum symbols as immutable when deep copying

2015-08-19 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated AVRO-1676:
--
Attachment: AVRO-1676.1.patch

Here's [~cutting]'s proposed patch as a patch rebased onto current trunk.

passes {{mvn -Dtest=TestSpecificData,TestGenericData clean package}} locally.

> Do not treat enum symbols as immutable when deep copying
> 
>
> Key: AVRO-1676
> URL: https://issues.apache.org/jira/browse/AVRO-1676
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Mike Rodriguez
> Attachments: AVRO-1676.1.patch
>
>
> Enum types should be supported in GenericData.deepCopy() so that we can 
> convert from (in memory) generics records to specifics records.
> Without this fix in place, it is non-trivial to attempt to create specific 
> records from generic records if they may happen to have any nested enum data 
> types.
> This issue has already been pointed out by Doug Cutting which can be 
> referenced at \[1] and \[2].
> In \[2] specifically, Doug proposed the change that would fix this issue.
> \[1] 
> https://issues.apache.org/jira/browse/AVRO-1455?focusedCommentId=13899535&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13899535
> \[2] 
> http://mail-archives.apache.org/mod_mbox/avro-user/201402.mbox/%3ccaleq1z9skctci0h8rtntyjabdm1hn8yuudtr2ft+czrvhto...@mail.gmail.com%3E
>  





Re: [VOTE] Avro release 1.8.0 (rc0)

2015-08-14 Thread Sean Busbey
-1 non-binding

* checksums match
* verified signatures with Tom's key from svn

I haven't gotten to finish going through everything, but I've already found
several places where we aren't meeting the ASF policy on license
notification[1]. HBase recently had to deal with a similar morass[2], so I
have some familiarity in fixing these things, but I'm afraid I don't have
bandwidth until late next week.

1) The following proposed distribution artifacts are completely missing
LICENSE/NOTICE files at the top level:

  * avro-csharp

  * avro-doc

  * py

  * python3

  * ruby gem

2) The Perl distribution artifact is missing the LICENSE file. It has a
NOTICE file, but that file declares Copyright (C) 2010 Yann Kerherve rather
than meeting the ASF policy.

3) The source distribution artifact is missing some LICENSE/NOTICE entries
for bundled third party works. I haven't gotten to exhaustively check, but
e.g.

* avro-src-1.8.0/lang/c/jansson/install-sh
* avro-src-1.8.0/lang/c/jansson/configure
* avro-src-1.8.0/lang/c++/m4/m4_ax_boost_base.m4
* avro-src-1.8.0/lang/java/ipc/src/main/java/org/apache/avro/ipc/stats/static/jquery-1.4.2.min.js

4) The c and c++ distribution artifacts have a single COPYING file
containing the ASLv2 text rather than the ASF required LICENSE/NOTICE.
additionally they are both missing needed LICENSE/NOTICE entries based on
what's present.

5) The php distribution artifact has LICENSE/NOTICE files that include
entries about the c and c# implementation, which are not actually bundled.

6) The avro-ipc-sources jar bundles jquery but does not mention it in the
META-INF/LICENSE file

7) The avro-tool fat jar has several extraneous license files and the
META-INF/LICENSE & META-INF/NOTICE do not properly reflect bundled works

e.g.
 * the top of the jar has a LICENSE.txt/NOTICE.txt file from some
dependency that should not be present
 * META-INF contains both LICENSE and LICENSE.txt, both are vanilla ASLv2.
we should have just LICENSE and it should include needed additions for
bundled works
 * META-INF/ASLv2 is another extraneous copy of the ASLv2 text
 * META-INF/NOTICE.txt appears to be from commons-compress
 * the majority of bundled works are not reflected at all (spot checking
shows e.g. some MIT and BSD that require entries in LICENSE)

nits:

a) The c# distribution artifact untars directly into the working directory.


[1]: http://www.apache.org/dev/licensing-howto.html
[2]: https://issues.apache.org/jira/browse/HBASE-14085

On Fri, Aug 14, 2015 at 1:28 AM, Sean Busbey  wrote:

> I'm working through verifying the release artifacts, but I can't verify
> the signatures because the KEYS file only contains Doug and Jeff H.
>
> Tom is your key published on people.apache? or could you add yourself to
> the avro dist KEYS file?
>
> On Tue, Aug 11, 2015 at 5:39 AM, Tom White  wrote:
>
>> I have created a candidate build for Avro release 1.8.0. This is the first
>> release that has been built using Docker.
>>
>> The changes are listed at:
>> http://s.apache.org/avro180
>>
>> The release artifacts can be found here:
>> http://people.apache.org/~tomwhite/avro-1.8.0-rc0/
>>
>> The tag corresponding to this release candidate is
>> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc0
>>
>> You can find the KEYS file here:
>> https://dist.apache.org/repos/dist/release/avro/KEYS
>>
>> The Maven staging repository is at:
>> https://repository.apache.org/content/repositories/orgapacheavro-1002
>>
>> Please download, verify, and test. Thanks in advance for voting!
>>
>> Cheers,
>> Tom
>>
>
>
>
> --
> Sean
>



-- 
Sean


Re: [VOTE] Avro release 1.8.0 (rc0)

2015-08-13 Thread Sean Busbey
I'm working through verifying the release artifacts, but I can't verify the
signatures because the KEYS file only contains Doug and Jeff H.

Tom is your key published on people.apache? or could you add yourself to
the avro dist KEYS file?

On Tue, Aug 11, 2015 at 5:39 AM, Tom White  wrote:

> I have created a candidate build for Avro release 1.8.0. This is the first
> release that has been built using Docker.
>
> The changes are listed at:
> http://s.apache.org/avro180
>
> The release artifacts can be found here:
> http://people.apache.org/~tomwhite/avro-1.8.0-rc0/
>
> The tag corresponding to this release candidate is
> http://svn.apache.org/repos/asf/avro/tags/release-1.8.0-rc0
>
> You can find the KEYS file here:
> https://dist.apache.org/repos/dist/release/avro/KEYS
>
> The Maven staging repository is at:
> https://repository.apache.org/content/repositories/orgapacheavro-1002
>
> Please download, verify, and test. Thanks in advance for voting!
>
> Cheers,
> Tom
>



-- 
Sean


[jira] [Created] (AVRO-1706) Ensure Java and cross-implementation check on simple fingerprints

2015-07-16 Thread Sean Busbey (JIRA)
Sean Busbey created AVRO-1706:
-

 Summary: Ensure Java and cross-implementation check on simple 
fingerprints
 Key: AVRO-1706
 URL: https://issues.apache.org/jira/browse/AVRO-1706
 Project: Avro
  Issue Type: Improvement
  Components: build, java
Reporter: Sean Busbey
Assignee: Sean Busbey


Make sure we have MD5 and SHA256 fingerprints for normalized schemas in Java 
and interop tests set up
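
As a rough illustration of what a "simple fingerprint" check might compare, the sketch below hashes an already-normalized schema string with the JDK's `MessageDigest`. `SimpleFingerprint` is a hypothetical name, and the assumption that the input is the schema's normalized text encoded as UTF-8 is mine; the normalization step itself is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: hex digest of an already-normalized schema string. */
final class SimpleFingerprint {
  /** algorithm is a JDK digest name such as "MD5" or "SHA-256". */
  static String hexFingerprint(String algorithm, String normalizedSchema) {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance(algorithm);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("Unknown digest: " + algorithm, e);
    }
    byte[] digest = md.digest(normalizedSchema.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder(digest.length * 2);
    for (byte b : digest) {
      hex.append(String.format("%02x", b & 0xff)); // mask to print the byte as unsigned
    }
    return hex.toString();
  }
}
```

An interop test could then assert that every language implementation produces the same hex string for the same normalized schema.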




