Re: AVRO Best Practices for Sparse object storage

2020-06-26 Thread Doug Cutting
A map schema might be appropriate. Another idea might be to define a record for every field, then use an array whose values are a union of all these records. This is a bit more complicated but would probably use the least space. Doug On Thu, Jun 25, 2020 at 4:14 PM Sohail Khan wrote: > Hello

Re: schema resolution vs logical types

2020-04-09 Thread Doug Cutting
On Wed, Apr 8, 2020 at 5:03 AM roger peppe wrote: > On Tue, 7 Apr 2020 at 17:57, Doug Cutting wrote: > >> On Tue, Apr 7, 2020 at 4:03 AM roger peppe wrote: >> >>> On the one hand the specification says >>> <https://avro.apache.org/docs/1.9.1/spec.

Re: schema resolution vs logical types

2020-04-07 Thread Doug Cutting
On Tue, Apr 7, 2020 at 4:03 AM roger peppe wrote: > On the one hand the specification says > > : > > If the Parsing Canonical Forms of two different schemas are textually >> equal, then those schemas are "the same"

Re: Is there a way to skip field decoding without materializing the data?

2020-04-03 Thread Doug Cutting
Yes, as Roger suggests, in Avro this is done by specifying a reader schema that contains a subset of the fields written. E.g., https://icircuit.net/avro-schema-projection/1446 Doug On Fri, Apr 3, 2020 at 8:09 AM roger peppe wrote: > If you're using a custom codec, this is potentially

Re: defaults for complex types (was Re: recursive types)

2019-12-06 Thread Doug Cutting
On Fri, Dec 6, 2019 at 2:38 AM Ryan Skraba wrote: > Naively, I would expect any JSON encoded data to be a valid default > value (which is not what the spec says). Does anyone know why the > "first schema only" rule was added to the spec? > I think we felt this would make things simpler. That

Re: Should a Schema be serializable in Java?

2019-07-15 Thread Doug Cutting
I can't think of a reason Schema should not implement Serializable. There's actually already an issue & patch for this: https://issues.apache.org/jira/browse/AVRO-1852 Doug On Mon, Jul 15, 2019 at 6:49 AM Ismaël Mejía wrote: > +d...@avro.apache.org > > On Mon, Jul 15, 2019 at 3:30 PM Ryan

Re: Alias with Backward Compatibility

2018-10-22 Thread Doug Cutting
liases should be avoided > for forward compatibility, but are fine for backward compatibility? If you > want full compatibility, you should think long and hard about your field > names. > > On Mon, Oct 22, 2018 at 2:05 PM Doug Cutting wrote: > >> Despite the change in field n

Re: Alias Issue

2018-10-15 Thread Doug Cutting
around having two versions around for testing purposes? > Should you have different namespaces or different record names? > > Thanks, > > Jesse > > On Fri, Oct 12, 2018 at 3:16 PM Doug Cutting wrote: > >> Jesse, >> >> The record names sho

Re: Alias Issue

2018-10-12 Thread Doug Cutting
Jesse, The record names should match (although Java has been loose about enforcement of that). Also, that should be "aliases", not "alias". What happens if you add: "aliases": ["SimpleCard"] to the second schema, and change the field alias to: "aliases": ["suit"] ? Alternately, you could

Re: What is the default value I can specify for bytes when it is in union?

2018-06-06 Thread Doug Cutting
The default for a union is interpreted as the type of its first branch. https://avro.apache.org/docs/current/spec.html#Unions Thus, in your example, if you want a nullable byte array, place "null" first in the union, e.g.: {"type":"record","name":"hello","fields":[{"name":"id","type":["null",

Re: Read schema from avro file without reading entire file

2018-05-03 Thread Doug Cutting
You might instead try using the blob's reader method? Something like: InputStream input = Channels.newInputStream(blob.reader()); try { return new DataFileStream(input, new GenericDatumReader()).getSchema(); } finally { input.close(); } Doug On Wed, May 2, 2018 at 4:30 PM Rodrigo Ipince

Re: JSON format change while converting from Avro Pojo

2018-04-25 Thread Doug Cutting
To preserve type information, Avro's json encoding tags union values with their type. https://avro.apache.org/docs/current/spec.html#json_encoding If you wish to avoid this tagging, then you may use toString() on Avro data. This will generate valid Json, although some type information may be

Re: GenericData.deepCopy() HotSpot

2018-02-05 Thread Doug Cutting
.valueOf(new Random().nextInt())); > user.setFavoriteColor("blue" +new Random().nextFloat()); > user.setData(ByteBuffer.wrap(new byte[15000])); > dataFileWriter.append(user); > } > dataFileWriter.close(); > stopwatch.stop(); > long elapsedTime = stopwatch

Re: GenericData.deepCopy() HotSpot

2018-01-29 Thread Doug Cutting
Builders have some inherent overheads. Things could be optimized to better minimize this, but it will likely always be faster to reuse a single instance when writing. The deepCopy's are probably of the default values of each field you're not setting. If you're only setting a few fields then you

Re: Concurrent Building of Avro Objects

2018-01-23 Thread Doug Cutting
This sounds like AVRO-1760, fixed since Avro 1.8.0. https://issues.apache.org/jira/browse/AVRO-1760 What version of Avro are you using? Doug On Mon, Jan 22, 2018 at 9:45 AM, Nishanth S wrote: > Hi All, > > We have a process that reads data from a local file share

Re: How to override hashCode/equals on java classes generated by avro.

2017-11-29 Thread Doug Cutting
If you specify "order" : "ignore" in a field then it won't be considered by hashCode or equals. Processing also won't descend into the value of that field, speeding things up. Doug On Nov 29, 2017 1:15 AM, "Tushar Gosavi" wrote: > Hi All, > > I am creating java objects using

Re: Java Class Generation Drops Underscore in field Name

2017-08-28 Thread Doug Cutting
This sounds like a bug. There is logic in SpecificCompiler#generateMethodName to try to avoid name conflicts, but it's clearly faulty in this case. One could add a flag that inhibited underscore removal, which would help here. What might be better would be to have a table of method names that have

Re: appending to Object Container Files

2017-08-25 Thread Doug Cutting
The Avro file format supports appends. However some filesystems (e.g., HDFS and S3) may not support append, so applications generally avoid depending on it. Also, it can complicate application semantics when the contents of files change after they are first created. The Java API supports

Re: Typo in the docs?

2017-07-17 Thread Doug Cutting
Aliases are only applicable to named types and fields. https://avro.apache.org/docs/current/spec.html#Aliases Named types are records, enums and fixed: https://avro.apache.org/docs/current/spec.html#names The fixed documentation that says, "supports two attributes" is indeed confusing. We

Re: Parsing canonical forms with schemas having default values.

2017-06-07 Thread Doug Cutting
When reading data, two schemas are used: a schema with the same fingerprint as used to write the data, typically the actual schema used to write, and the schema you'd like to project to. Default values are only used from the latter schema. Matching fingerprints indicate binary compatibility.
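
To make the fingerprint comparison above concrete, here is a minimal JDK-only sketch of the 64-bit Rabin fingerprint (CRC-64-AVRO) that the Avro specification describes for schema fingerprints. The class name `FingerprintSketch` is hypothetical, and the input strings only stand in for Parsing Canonical Form text; real code would normally rely on Avro's own `SchemaNormalization` class rather than a hand-rolled copy.

```java
import java.nio.charset.StandardCharsets;

final class FingerprintSketch {
    // Seed constant taken from the CRC-64-AVRO description in the Avro specification.
    static final long EMPTY = 0xc15d213aa4d7a795L;
    private static final long[] TABLE = new long[256];

    static {
        // Precompute the per-byte lookup table once.
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) {
                fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            }
            TABLE[i] = fp;
        }
    }

    /** Fingerprints the UTF-8 bytes of a schema's canonical text. */
    static long fingerprint64(String canonicalForm) {
        long fp = EMPTY;
        for (byte b : canonicalForm.getBytes(StandardCharsets.UTF_8)) {
            fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
        }
        return fp;
    }

    public static void main(String[] args) {
        // Identical canonical text always yields the same fingerprint.
        long writer = fingerprint64("\"int\"");
        long reader = fingerprint64("\"int\"");
        System.out.println(writer == reader); // prints true
    }
}
```

Two schemas whose canonical text is identical always produce equal fingerprints here, which is the property the snippet relies on; default values never enter the computation because they are not part of the canonical form.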

Re: avro-tools not serialising multibyte chars today

2017-03-17 Thread Doug Cutting
Maybe your JVM's default charset has changed? Try -Dfile.encoding="UTF-8" when you start Java. Even if that fixes things, it's perhaps still a bug. The tool should probably not depend on the default charset, but should explicitly set its expected input encoding. So, if that's the problem,
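
As a rough illustration of "explicitly set its expected input encoding", the sketch below reads a stream as UTF-8 regardless of the JVM's default charset. It uses only JDK classes; `ExplicitCharsetSketch` and `readUtf8` are hypothetical names, and this is not how avro-tools itself is written.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetSketch {
    /** Reads the whole stream as UTF-8, ignoring the JVM's default charset. */
    static String readUtf8(InputStream in) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            for (int n = reader.read(buffer); n != -1; n = reader.read(buffer)) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Multibyte characters survive even when file.encoding is not UTF-8.
        byte[] utf8 = "na\u00efve \u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);
        String decoded = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(decoded.length()); // prints 9
    }
}
```

Passing the charset to `InputStreamReader` is the key step; the one-argument constructor would fall back to whatever the platform default happens to be.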

Re: Implementation of compatibility rules

2017-02-22 Thread Doug Cutting
Support for aliases should be easy to add by calling Schema#applyAliases before the compatibility check. Whether aliases should be applied depends on whether the compatibility check is meant to be valid only for implementations that support aliases or also ones that do not. Note that support for

Re: java.math.BigDecimal to Avro .avdl file help please

2017-02-16 Thread Doug Cutting
I believe this has already been implemented but not yet released. It was implemented in: https://issues.apache.org/jira/browse/AVRO-1847 This is slated to be included in the 1.8.2 release, which should soon be out. Doug On Wed, Feb 15, 2017 at 6:50 PM, Steve Sun wrote:

Re: avro maven generated code doesn't implement serialize

2016-06-06 Thread Doug Cutting
SpecificRecordBase implements Serializable. Doug On Mon, Jun 6, 2016 at 11:27 AM, Giri P wrote: > Hi, > > When I generate the code using avro maven plugin it doesn't implement > serilalizable. > > public class Attributed extends > org.apache.avro.specific.SpecificRecordBase

Re: Not able to resolve union for array type

2016-06-03 Thread Doug Cutting
Your schema permits null or an array of records. I suspect you want an array containing nulls or records, e.g., {"type":"array","items":["null",{"type":"record"... On Jun 3, 2016 5:54 PM, "Giri P" wrote: > Hi, > > I'm getting below error when I try to insert null into union

Re: OutOfMemoryError while writing a large map to a file

2016-01-28 Thread Doug Cutting
On Thu, Jan 28, 2016 at 7:51 AM, David Kincaid wrote: > Does anyone have a suggestion for making the write not try to copy the data > in memory as it's writing? BlockingBinaryEncoder is meant to do this, but I don't know how much it's been used.

Re: Adding new field with default value to an Avro schema

2015-02-03 Thread Doug Cutting
On Tue, Feb 3, 2015 at 9:34 AM, Lukas Steiblys lu...@doubledutch.me wrote: On a related note, is there a tool that can check the backwards compatibility of schemas? https://avro.apache.org/docs/current/api/java/org/apache/avro/SchemaCompatibility.html Doug

Re: Feature: Clear all fields / Reset all fields to default value on Record template

2015-01-07 Thread Doug Cutting
On Tue, Jan 6, 2015 at 1:33 PM, Maulik Gandhi mmg...@gmail.com wrote: I was wondering if adding a functionality of clearing all fields on Record, makes sense or not? I was wondering if adding a functionality of reseting all fields to default value (the default value would be what has been

Re: Exception No protocol name specified for Record

2015-01-07 Thread Doug Cutting
You're calling GoraCompiler#main, which inherits SpecificCompiler#main, which unconditionally calls compileSchema. You could wrap your record schema in {protocol:X, ..., types: [ schema ] } so this works. Or you might use SpecificCompilerTool, which supports compiling both schemas and protocols

Re: Avro schema and data read with it.

2014-12-17 Thread Doug Cutting
Avro skips over fields that were present in the writer's schema but are no longer present in the reader's schema. Skipping is substantially faster than reading for most types. For known-size types like string, bytes, fixed, double and float the file pointer can be incremented past skipped
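
A minimal sketch of why skipping is cheap for length-prefixed and fixed-width types, written against plain byte arrays rather than Avro's decoder classes. The record layout (a string, a double, then a long) and all names such as `SkipSketch` and `readIdOnly` are assumptions made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SkipSketch {
    /** Holds the read position so helpers can advance it. */
    static final class Cursor {
        final byte[] buf;
        int pos;
        Cursor(byte[] buf) { this.buf = buf; }
    }

    /** Decodes one zigzag, variable-length long as used by Avro's binary encoding. */
    static long readLong(Cursor c) {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = c.buf[c.pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** A string is a length prefix plus that many bytes: jump over them. */
    static void skipString(Cursor c) { c.pos += (int) readLong(c); }

    /** A double is always 8 bytes. */
    static void skipDouble(Cursor c) { c.pos += 8; }

    /** Reads only the trailing long of a hypothetical {string, double, long} record. */
    static long readIdOnly(byte[] record) {
        Cursor c = new Cursor(record);
        skipString(c);
        skipDouble(c);
        return readLong(c);
    }

    // Encoding helpers, present only so the demo can build its own input.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7fL) != 0) {
            out.write((int) ((z & 0x7f) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, "skipped name");
        out.write(new byte[8], 0, 8); // a double with all-zero bits
        writeLong(out, -42L);
        System.out.println(readIdOnly(out.toByteArray())); // prints -42
    }
}
```

The skipped string is never decoded into characters; only its length prefix is parsed, which is where the speed-up over a full read comes from.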

Re: Reading huge records

2014-12-16 Thread Doug Cutting
Avro does permit partial reading of arrays. Arrays are written as a series of length-prefixed blocks: http://avro.apache.org/docs/current/spec.html#binary_encode_complex The standard encoders do not write arrays as multiple blocks, but BlockingBinaryEncoder does. It can be used with any
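
The block structure mentioned above can be sketched without the Avro library. The example below walks an array of longs block by block, following the layout given in the specification: each block starts with an item count, a negative count is followed by the block's size in bytes, and a count of zero ends the array. `BlockedArraySketch` and its methods are hypothetical names, and the input bytes are hand-encoded for the demo.

```java
final class BlockedArraySketch {
    /** Holds the read position so helpers can advance it. */
    static final class Cursor {
        final byte[] buf;
        int pos;
        Cursor(byte[] buf) { this.buf = buf; }
    }

    /** Decodes one zigzag, variable-length long as used by Avro's binary encoding. */
    static long readLong(Cursor c) {
        long raw = 0;
        int shift = 0;
        int b;
        do {
            b = c.buf[c.pos++] & 0xff;
            raw |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Sums an encoded array of longs one block at a time, never holding the whole array. */
    static long sumArray(byte[] encoded) {
        Cursor c = new Cursor(encoded);
        long sum = 0;
        for (long count = readLong(c); count != 0; count = readLong(c)) {
            if (count < 0) {
                count = -count;
                readLong(c); // block size in bytes; unused here, but it lets a reader skip the block
            }
            for (long i = 0; i < count; i++) {
                sum += readLong(c);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Block of 2 items (1, 2); block with count -1, size 1, item 3; then the 0 terminator.
        byte[] encoded = {0x04, 0x02, 0x04, 0x01, 0x02, 0x06, 0x00};
        System.out.println(sumArray(encoded)); // prints 6
    }
}
```

Because each block is self-describing, a consumer can process or discard one block before the next has even been produced, which is what makes partial reading of very large arrays possible.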

Re: GenericData union validation - IndexOutOfBoundsException

2014-12-16 Thread Doug Cutting
Yes, this looks like a bug in GenericData#validate(). It should use GenericData#resolveUnion(). Please file an issue in Jira. If you are able, attach a patch that includes a test. Thanks, Doug On Tue, Dec 9, 2014 at 12:30 PM, Jeffrey Mullins (BLOOMBERG/ BOSTON) jmullin...@bloomberg.net

Re: How to fix Expected start-union. Got VALUE_NUMBER_INT when converting JSON to Avro on the command line?

2014-12-15 Thread Doug Cutting
Avro's JSON encoding requires that non-null union values be tagged with their intended type. This is because unions like ["bytes","string"] and ["int","long"] are ambiguous in JSON: the branches of the first are both encoded as JSON strings, while those of the second are both encoded as JSON numbers.
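
A small sketch of what the tagging looks like in practice. It only builds strings and does not use Avro's JSON encoder; `UnionJsonSketch` and `tag` are hypothetical names chosen for this example.

```java
final class UnionJsonSketch {
    /** Wraps a non-null union value as {"branch": value}; a null value stays a bare null. */
    static String tag(String branch, String jsonValue) {
        if (branch.equals("null")) {
            return "null";
        }
        return "{\"" + branch + "\":" + jsonValue + "}";
    }

    public static void main(String[] args) {
        // A bare 1 could be either branch of an int/long union; the tag removes the ambiguity.
        System.out.println(tag("int", "1"));     // prints {"int":1}
        System.out.println(tag("long", "1"));    // prints {"long":1}
        System.out.println(tag("null", "null")); // prints null
    }
}
```

An untagged number where a union is expected is what produces errors of the form "Expected start-union. Got VALUE_NUMBER_INT": the decoder is looking for the opening brace of the tag object.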

Re: How to bind several Responder in single port

2014-12-15 Thread Doug Cutting
On Mon, Dec 15, 2014 at 12:21 AM, 聂琨琳 nie...@126.com wrote: Is there any way to bind several Responder in single port? That's not currently supported. You could programmatically create a protocol with the union of the types and messages of several protocols and serve that with a single

Re: Building skip table for Avro data

2014-12-08 Thread Doug Cutting
On Thu, Dec 4, 2014 at 8:05 PM, Ken Krugler kkrugler_li...@transpac.com wrote: a. how is it sorted lexicographically (as per the SortedKeyValueFile JavaDocs)? The key/value pairs are sorted by their key schema, as per Avro's order specification:

Re: avdl schema compatibility

2014-10-27 Thread Doug Cutting
The type of the objfsptr parameter looks correct to me. Rather than a reference to the name of the record it has the record's definition. What looks incorrect about this to you? That said, I have no idea what's causing that error. Doug On Sun, Oct 26, 2014 at 12:20 PM, Camp, Jonathan

Re: Converting POJO to Generic or Indexed Records

2014-10-24 Thread Doug Cutting
On Tue, Oct 21, 2014 at 10:11 AM, umahida urvish.mah...@gmail.com wrote: I have complicated nested POJOs. I am able to generate the schema using ReflectData tool. I want to convert them to Generic Record preferably or Indexed Records. To convert from a reflect representation to generic you

Re: invalid file with avro tools random and tojson

2014-10-24 Thread Doug Cutting
This is a bug. It works if you change the command to not use standard output, e.g.: random --schema-file schema.avsc --count 20 test.avro The problem is that TestUtil.java prints something to standard output that should go to standard error. I filed an issue in Jira for this and will post a

Re: Extracting records as bytes from avro file

2014-10-15 Thread Doug Cutting
There is no method to read individual records as binary since they're not delimited nor length-prefixed. Instead you can get a block of records as binary and the count of records in the block and pass that to a deserializer that parses individual records. Doug On Wed, Oct 15, 2014 at 4:47 AM,

Re: Adding Conclusion to Avro Spec

2014-09-29 Thread Doug Cutting
On Sat, Sep 27, 2014 at 9:20 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Is there a possibility that someone can add material on how static the specification documentation is Minor versions are meant to have back-compatible APIs and both back- and forward-compatible data formats.

Re: Avro and serializable interface

2014-08-28 Thread Doug Cutting
On Thu, Aug 28, 2014 at 4:22 AM, Casadio Phaedra phaedra.casa...@datamanagementpa.it wrote: Now the problem is that avro beans does not extend serializable interface. How can i solve the problem? if it’s possible… There is a request to change this. I will upload a patch there soon.

Re: Passively Converting Null Map to be valid

2014-08-27 Thread Doug Cutting
On Wed, Aug 27, 2014 at 11:08 AM, Micah Whitacre mkwhita...@gmail.com wrote: We are reading with a BufferedBinaryDecoder and using the new schema as both the written and reader schema because the written schema is not preserved with the payload so it is not easy to retrieve. My questions are:

Re: Avro multiplexing, availability of authentication info to services

2014-08-22 Thread Doug Cutting
On Fri, Aug 22, 2014 at 6:34 AM, Sam Lawrance s...@illumynite.com wrote: is there any way for the service implementation to obtain information about the calling client, such as the hostname or some form of context from SASL? Transceiver#getRemoteName() returns the remote host name, but this

Re: State of the C++ vs Java implementations

2014-08-15 Thread Doug Cutting
On Thu, Aug 14, 2014 at 1:03 PM, John Lilley john.lil...@redpoint.net wrote: Do you know where I can find a list of codecs supported in Java vs C++? Grepping the Avro C++ headers, it seems to support just the null codec and deflate. These are the two codecs that every implementation is meant

Re: State of the C++ vs Java implementations

2014-08-14 Thread Doug Cutting
On Thu, Aug 14, 2014 at 11:56 AM, John Lilley john.lil...@redpoint.net wrote: I’m seeing discussion of a new Decimal encoding in the mailing list, and it would be bad for us to commit to the C++ Avro, and then find that our customers have created Avro files (using Java, MapReduce, etc) that we

Re: Avro and Jackson dependency

2014-08-08 Thread Doug Cutting
Avro currently requires Jackson 1.x. Jackson 2.x has an incompatible API in a different package. The two versions of Jackson do not conflict so both may be used within a single application. Whether and when to eventually upgrade Avro to use Jackson 2.x is discussed in:

Re: Problem with cross-references (avro-maven-plugin)

2014-08-04 Thread Doug Cutting
I think it works on Windows because the directory there is listed in an order that happens to work, whereas in Linux the directory ordering doesn't happen to work. We could perhaps always sort directory listings alphabetically so this is deterministic, but requiring folks to create schema names

Re: How to deserialize avro file with union/many schemas?

2014-07-24 Thread Doug Cutting
On Thu, Jul 24, 2014 at 7:23 AM, Echo echo...@gmail.com wrote: The avro library can't read the file with that 'union' schema, so I wonder: With which Avro library can't you read a file with a union schema? Unions are a standard feature and every implementation should be able to read a file with

Re: How to use java-class with JSON schema?

2014-07-03 Thread Doug Cutting
The java-class attribute is supported by the reflect implementation, not by the code-generating specific implementation. So you could define Foo in Java with something like: public class Foo { private long batchId; @Stringable private Timestamp timestamp; public Foo() {} public Foo(long

Re: aliasing items in the default namespace

2014-06-30 Thread Doug Cutting
On Fri, Jun 27, 2014 at 5:01 PM, Josh Buffum jbuf...@gmail.com wrote: Is there a way for me to create a new record (with a namespace) using an old record name (with no namespace) as an alias? No, but this is easily remedied. I filed the following issue and attached a fix:

Re: Schema Coersion

2014-06-27 Thread Doug Cutting
Aliases may help here. If you have a schema in one namespace and wish to read data written with a schema in another namespace, potentially with different field names even, then you can add aliases to your schema naming the other schema (and its fields, if needed). For example, if you want to

Re: Schema Coersion

2014-06-27 Thread Doug Cutting
On Fri, Jun 27, 2014 at 9:40 AM, Pritchard, Charles X. -ND charles.x.pritchard@disney.com wrote: is there a means to coerce data written as byte[] into String (and the other way around) ? Not easily today. Binary string values are a subset of bytes (those that are valid UTF-8 sequences),
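
To illustrate the "valid UTF-8 sequences" condition, here is a JDK-only sketch that checks whether a byte array could be read as a string. It is not part of Avro's schema resolution, and `Utf8CheckSketch` is a hypothetical name; the point is only that some byte values have no string interpretation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8CheckSketch {
    /** Returns true only if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] text = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {(byte) 0xff, (byte) 0xfe};
        System.out.println(isValidUtf8(text));   // prints true
        System.out.println(isValidUtf8(binary)); // prints false
    }
}
```

Using `CodingErrorAction.REPORT` matters: the default `new String(bytes, UTF_8)` silently substitutes replacement characters, so it cannot tell the two cases apart.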

Re: 1.7.6 Slow Deserialization

2014-06-23 Thread Doug Cutting
On Mon, Jun 16, 2014 at 7:59 AM, Han, Xiaodan xiaodan@baml.com wrote: Is GenericDatumReader thread safe? What about the writer. Yes, a DatumReader instance may be used in multiple threads. Encoder and Decoder are not thread-safe, but DatumReader and DatumWriter are. If we change the

Re: 1.7.6 Slow Deserialization

2014-06-13 Thread Doug Cutting
On Wed, Jun 11, 2014 at 1:05 PM, Han, Xiaodan xiaodan@baml.com wrote: org.apache.avro.specific.SpecificDatumReader.findStringClass(SpecificDatumReader.java:80) org.apache.avro.generic.GenericDatumReader.getStringClass(GenericDatumReader.java:394) The results of findStringClass are cached by

Re: Custom hashCode function

2014-05-28 Thread Doug Cutting
On Mon, May 26, 2014 at 3:43 AM, Han JU ju.han.fe...@gmail.com wrote: My question for the moment: is it possible to custom the hashCode function of a avro record? Say a record a a field `uid` and I'd like to return this value as the hashCode. If you're using Avro's reflect model then you can

Re: Schema import dependencies

2014-05-28 Thread Doug Cutting
, is there a standard API that load them both. Hrishikesh P mentions avro maven plugin. I mainly use the Python API so I am unfamiliar with this. Is a comparable API exist? I understand the IDL form has explicit linking of schema files. I will look into it next. Wai Yip Doug Cutting cutt

Re: Schema import dependencies

2014-05-28 Thread Doug Cutting
a data file they need to be merged into one standalone schema. The maven plugin does this. Otherwise we have to merge it ourselves. This is not too hard to merge. I just want make sure I'm not missing some exiting tool or API available. Wai Yip Doug Cutting cutt...@apache.org Wednesday

Re: Is it possible to write a magic byte in Avro file head?

2014-05-23 Thread Doug Cutting
need to be stored in each record. 2014-05-17 7:20 GMT+08:00 Doug Cutting cutt...@apache.org: This incompatibly alters the Avro file format. Could you perhaps instead add this into the Avro file's metadata? Doug On Thu, May 15, 2014 at 5:44 AM, Fengyun RAO raofeng...@gmail.com wrote

Re: Schema import dependencies

2014-05-22 Thread Doug Cutting
You might instead use Avro IDL to define your schemas. It permits you to define multiple schemas in a single file, so that you can determine the order they're defined in. It also permits ordered inclusion of types from other files, both IDL files and schema files. Doug On Thu, May 22, 2014 at

Re: Is it possible to write a magic byte in Avro file head?

2014-05-16 Thread Doug Cutting
This incompatibly alters the Avro file format. Could you perhaps instead add this into the Avro file's metadata? Doug On Thu, May 15, 2014 at 5:44 AM, Fengyun RAO raofeng...@gmail.com wrote: I have a cache file using Avro serialization, and I want to add a magic byte indicating cache version

Re: Avro cycle support

2014-05-16 Thread Doug Cutting
On Wed, May 14, 2014 at 10:38 AM, Bernhard Damberger bdamber...@walmartlabs.com wrote: 1. Is there a plan to add cycle support to Avro? I saw this ticket: https://issues.apache.org/jira/browse/AVRO-695. But it hasn't been worked on since 1/2011. 2. Why does Avro handle cyclical

Re: How to deserialize java resource file in the jar package?

2014-05-02 Thread Doug Cutting
On Tue, Apr 29, 2014 at 10:43 PM, Fengyun RAO raofeng...@gmail.com wrote: Could there also be an API requiring only a Stream object in java, which I think would be quite convenient. Please see DataFileStream, which can be constructed using an InputStream. This is the base class for

Re: Go library

2014-03-20 Thread Doug Cutting
I have not heard of any work on an implementation of Avro in go. It would make a great addition, even if only data file support. Doug On Sat, Mar 15, 2014 at 5:59 AM, Mike Stanley m...@mikestanley.org wrote: Anyone know of any avro libraries for go? I haven't had much luck finding anything.

Re: Problem while Converting from JSON=>Avro=>JSON

2014-03-14 Thread Doug Cutting
To generate a file with a subset of fields you can specify a 'reader' schema that contains only the desired fields. For example, if you have a schema like: {"type":"record","name":"Event","fields":[ {"name":"id","type":"int"}, {"name":"url","type":"string"},

Re: what's the efficiency difference between type: string and [string, null]

2014-03-14 Thread Doug Cutting
One small note: the best practice is to place null first when it's in a union. This is because the type of a default value for a union field is the type of the first element of the union, and null is the most commonly used default value for unions with null. So the idiom for a field that
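
The null-first convention can be captured in a tiny helper that emits the field declaration. This is a sketch only: `NullableFieldSketch` and `nullableField` are hypothetical names, and the output is a schema fragment assembled as a string rather than a schema built through Avro's API.

```java
final class NullableFieldSketch {
    /** Builds a field whose union lists "null" first, so that null is a legal default. */
    static String nullableField(String name, String type) {
        return "{\"name\":\"" + name + "\",\"type\":[\"null\",\"" + type + "\"],\"default\":null}";
    }

    public static void main(String[] args) {
        // prints {"name":"nickname","type":["null","string"],"default":null}
        System.out.println(nullableField("nickname", "string"));
    }
}
```

If the branches were listed the other way round, the default would have to be a string, since a union field's default is interpreted as a value of the union's first branch.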

Re: Question about the state of the AVRO C# (csharp) implementation

2014-03-13 Thread Doug Cutting
On Thu, Mar 13, 2014 at 7:34 AM, Longton, Nigel nigel.long...@pimco.com wrote: Are there plans to make C# feature equivalent? As a volunteer-based open-source project, we don't have a long-term plan. Rather we, as a group, consider contributions as they arrive and generally accept those that

Re: How to specify the CSharp class name of a field

2014-03-13 Thread Doug Cutting
On Thu, Mar 13, 2014 at 7:59 AM, Longton, Nigel nigel.long...@pimco.com wrote: What is the equivalent to ‘java-class’ for the csharp code generator? There isn't an equivalent at this point. Specifically we’re looking at having fields of date and decimal type. You could perhaps interpret the

Re: Avro 1.7.6 Enums with fields

2014-03-07 Thread Doug Cutting
You can add attributes to the schema, e.g.: {"type":"enum", "name":"MyEnum", "symbols":["FOO", "BAR", "BAZ"], "indexes":[10,11,12], "descriptions":["foo is...", "bar is..", "baz is"] } Then, if you generate specific code, you can access this with something like: int fooIndex =

Re: Enum backward compatibility in distributed services...

2014-02-27 Thread Doug Cutting
On Tue, Jan 28, 2014 at 9:43 AM, Amihay Zer-Kavod amih...@gmail.com wrote: Bottom line, I would go with Flex approach and retire the Specific approach entirely. I filed an issue in Jira for this. https://issues.apache.org/jira/browse/AVRO-1468 Doug

Re: Create Avro from bytes, not by fields

2014-02-07 Thread Doug Cutting
You might use DataFileWriter#appendEncoded: http://avro.apache.org/docs/current/api/java/org/apache/avro/file/DataFileWriter.html#appendEncoded(java.nio.ByteBuffer) If the body has just single instance of the record then you'd call this once. If you have multiple instances then you might change

Re: Direct conversion from Generic Record to Specific Record

2014-02-06 Thread Doug Cutting
SpecificData#deepCopy will make this conversion. It currently fails for enums, but the fix is easy. Here's a patch that makes that fix and demonstrates a conversion. If this change is of interest, please file an issue in Jira. Doug Index:

Re: Facing issue while using avro-maven-plugin

2014-02-05 Thread Doug Cutting
The problem with deepCopy may be a mismatch between the version of Avro you're using at compile time and the version you're using at runtime. When exactly are you getting this error? Doug On Wed, Feb 5, 2014 at 9:32 AM, Christophe Taton christophe.ta...@gmail.com wrote: Hi, I do not know

Re: java.lang.ClassCastException: java.util.ArrayList$Itr cannot be cast to org.apache.avro.generic.IndexedRecord

2014-02-04 Thread Doug Cutting
It's hard to tell without more detail, but I strongly suspect this is a Camel issue rather than a purely Avro issue. It might thus better be asked on the Camel user mailing list. https://camel.apache.org/mailing-lists.html Doug On Tue, Feb 4, 2014 at 8:54 AM, Kostas Margaritis

Re: Enum backward compatibility in distributed services...

2014-01-27 Thread Doug Cutting
You'd like the compile-time type-checking of specific, but the run-time flexibility of generic, right? Here's a way we might achieve this. Given the following schemas: {"type":"enum", "name":"Color", "symbols":["RED", "GREEN", "BLUE"]} {"type":"record", "name":"Shape", "fields":[ {"name":"xPosition", "type":"int"},

Re: [ANNOUNCE] Avro release 1.7.6

2014-01-27 Thread Doug Cutting
On Sat, Jan 25, 2014 at 8:58 PM, Christophe Taton christophe.ta...@gmail.com wrote: Is it also possible to push the Python3 version to PyPi? I just did that. https://pypi.python.org/pypi/avro-python3/1.7.6 I renamed it avro-python3. Does that seem reasonable? It seems like one of several

Re: Schema exclusion from Avro message

2014-01-27 Thread Doug Cutting
If you're using Avro's RPC mechanism, schemas are only sent when the client and server do not already have each other's schema. Each client request is preceded by a hash of the client's schema and the schema it thinks the server is using. If the server already has the client's schema, and the
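
A simplified sketch of the hash-first idea, covering only the server's cache of client protocols. It assumes an MD5 hash of the protocol text, as in the handshake described by the Avro specification, and leaves out the server-hash half of the exchange. `HandshakeSketch`, `accept`, and the sample protocol text are hypothetical; this is not Avro's actual `Responder` code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class HandshakeSketch {
    // Protocol text the server has already seen, keyed by its hash.
    private final Map<String, String> protocolsByHash = new HashMap<>();

    /** Hex-encoded MD5 of the protocol text. */
    static String md5Hex(String protocolJson) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(protocolJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns true when the server can proceed. A false result means the hash was
     * unknown and the client must retry, this time including the full protocol text.
     */
    boolean accept(String clientHash, String clientProtocol) {
        if (protocolsByHash.containsKey(clientHash)) {
            return true;
        }
        if (clientProtocol != null && md5Hex(clientProtocol).equals(clientHash)) {
            protocolsByHash.put(clientHash, clientProtocol);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        HandshakeSketch server = new HandshakeSketch();
        String protocol = "{\"protocol\":\"Echo\"}"; // hypothetical, deliberately incomplete
        String hash = md5Hex(protocol);
        System.out.println(server.accept(hash, null));     // prints false
        System.out.println(server.accept(hash, protocol)); // prints true
        System.out.println(server.accept(hash, null));     // prints true
    }
}
```

After the first full exchange, every later request from the same client carries only the short hash, which is why the schema text is rarely on the wire.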

[ANNOUNCE] Avro release 1.7.6

2014-01-24 Thread Doug Cutting
I'd like to announce the availability of Avro release 1.7.6. Changes are listed at: http://s.apache.org/avro176 This release can be downloaded from: https://www.apache.org/dyn/closer.cgi/avro/ Java jar files are available from Maven Central. Ruby artifacts are at RubyGems. Python is at

Re: upgrading to Avro 1.7.5

2014-01-23 Thread Doug Cutting
As others have noted, the data format has not changed and should be compatible. The performance difference you note is curious. I wonder if this could be related to the following issue? https://issues.apache.org/jira/browse/AVRO-1348 You could try the 1.7.6 release candidate to test this:

Re: Avro C++ RPC support

2014-01-23 Thread Doug Cutting
On Thu, Jan 16, 2014 at 7:34 AM, Sanket Satyaki sanketsaty...@gmail.com wrote: I wanted to check, if there is any ongoing request for providing the RPC support in C++? There was some work on this a few years ago, but it was never completed. https://issues.apache.org/jira/browse/AVRO-484

Re: Nullable Fields

2014-01-21 Thread Doug Cutting
On Mon, Jan 20, 2014 at 3:50 AM, Alparslan Avcı alparslan.a...@agmlab.com wrote: {name: field1, type: [null, {type:map, values:[null, string]}],default:null} can be represented as like {name: field1, type: {type:map, values:string, nullable:true}, nullable:true, default:null} You'd like an

Re: Avro Array and combining schema ?

2014-01-21 Thread Doug Cutting
It sounds like you'd like to reference schemas in other files? If so, you can do this with Maven using the import configuration, added in https://issues.apache.org/jira/browse/AVRO-1188. Also the command-line compiler will process files in order, with types defined earlier on the command line

Re: Handling field names when serializing and deserializing JSON

2014-01-14 Thread Doug Cutting
On Tue, Jan 14, 2014 at 2:06 PM, Pritchard, Charles X. -ND charles.x.pritchard@disney.com wrote: Do I just pop the “original” field name in as an alias and use the “safe” (alphanumeric+underscore) one as the primary name? I'm not exactly sure what you're trying to do. Aliases in the

Re: Handling field names when serializing and deserializing JSON

2014-01-14 Thread Doug Cutting
for an extensive discussion. Doug On Tue, Jan 14, 2014 at 2:45 PM, Pritchard, Charles X. -ND charles.x.pritchard@disney.com wrote: On Jan 14, 2014, at 2:32 PM, Doug Cutting cutt...@apache.org wrote: On Tue, Jan 14, 2014 at 2:06 PM, Pritchard, Charles X. -ND charles.x.pritchard

Re: Avro + SSL ?

2014-01-10 Thread Doug Cutting
On Thu, Jan 9, 2014 at 10:02 PM, Sid Shetye sid...@outlook.com wrote: Does anyone know how to use Avro RPC along with SSL? The current C# implementation only has a SocketTransceiver implementation, which does not support secure connections. If you want to interoperate with Java and other

Re: Avro Read with sync() {java.io.IOException: Invalid sync}

2013-12-23 Thread Doug Cutting
This sounds like a bug. I wonder if it is similar to a related bug in Hadoop? https://issues.apache.org/jira/browse/HADOOP-9307 If so, please file an issue in Jira. Doug On Sat, Dec 21, 2013 at 4:35 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote: Hello, I have a 340 MB avro data file that

Re: Effort towards Avro 2.0?

2013-12-03 Thread Doug Cutting
On Mon, Dec 2, 2013 at 1:56 PM, Philip Zeyliger phi...@cloudera.com wrote: It sounds like you're proposing to break language API compatibility. Are you also proposing to break wire compatibility for Avro HTTP RPC, Avro Netty RPC, and/or Avro datafiles? We should be able to provide

Re: Correct avsc definition for array of external object's

2013-11-08 Thread Doug Cutting
On Fri, Nov 8, 2013 at 12:19 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: The question I am asking is whether I should embed the ExtractorSpec as a nested record? Or is there another way such as importing? Nesting would certainly work, but may make things harder to maintain if you

Re: GenericDatumReader and datum reuse

2013-10-31 Thread Doug Cutting
A simple approach to reuse is to pass the value previously returned from read(): GenericRecord record = null; while (...) { record = reader.read(record, decoder); ... code that does not retain a pointer to record ... } Doug On Wed, Oct 30, 2013 at 3:07 PM, kulkarni.swar...@gmail.com

Re: Partial lookup without full deserialization

2013-10-31 Thread Doug Cutting
You can specify a reader schema of simply {a:int}. Avro will efficiently skip missing fields when parsing values. Note that you still need the original, full schema (the writer schema). This is achieved through the schema resolution rules.

Re: Setting field default value's programmatically

2013-10-27 Thread Doug Cutting
On Sat, Oct 26, 2013 at 3:41 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: 2) Why org.apache.avro.data.RecordBuilderBase#defaultValue(Field field) is not returning null if no default is specified in the schema... which is the case? It returns null when null is the default value,

Re: expect specific record but get generic

2013-10-21 Thread Doug Cutting
If the generated classes are not on the classpath then the generic representation is used. So, yes, this sounds like a classpath problem. On Mon, Oct 21, 2013 at 8:41 AM, Koert Kuipers ko...@tresata.com wrote: i am observing that on a particular system (spark) my code breaks in that avro does

Re: expect specific record but get generic

2013-10-21 Thread Doug Cutting
On Mon, Oct 21, 2013 at 1:19 PM, Koert Kuipers ko...@tresata.com wrote: doug, could it be a classloader (instead of classpath) issue? looking at spark it seems to run the tasks inside the slaves/workers with a custom classloader. Yes, it could be a classloader issue. Perhaps you need to pass

Re: default values

2013-10-20 Thread Doug Cutting
Note that builders do supply defaults. http://avro.apache.org/docs/current/api/java/org/apache/avro/data/RecordBuilder.html But that might not help you much here. You might intersect your config json with the reader schema to determine its implied writer schema. Doug On Oct 13, 2013 9:21 PM,

Re: avrogencpp and multiple .avsc

2013-10-15 Thread Doug Cutting
On Tue, Oct 15, 2013 at 3:21 PM, William McKenzie wsmck...@cartewright.com wrote: Is it possible to generate code for multiple schemas at one time, and resolve references between them? The command line 'compile' tool and maven task both support this. One can pass multiple schema files on the

Re: GenericRecord and passivity

2013-10-15 Thread Doug Cutting
GenericRecord should work well in this context. Can you provide a complete example that fails? Doug On Tue, Oct 15, 2013 at 3:43 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: Do we know if a GenericRecord is robust to schema evolution? I am currently seeing cases where I

Re: GenericRecord and passivity

2013-10-15 Thread Doug Cutting
On Tue, Oct 15, 2013 at 4:49 PM, Eric Wasserman ewasser...@247-inc.com wrote: Change this line: DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema_11); to this: DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema_10, schema_11); Yes, that

Re: Unable to compile a namespace-less schema

2013-10-11 Thread Doug Cutting
, it cannot import the classes generated from the namespace-less schema. just run mvn:compile to get the compilation errors Thanks, Vitaly On Thu, Oct 10, 2013 at 1:58 PM, Doug Cutting cutt...@apache.org wrote: I encourage you to please provide a complete test, code that fails. If maven is involved

Re: Unable to compile a namespace-less schema

2013-10-10 Thread Doug Cutting
from it. Is there any way to read specific records when the schema that was used to write them contains no namespace? Thanks, Vitaly On Wed, Oct 9, 2013 at 6:07 PM, Doug Cutting cutt...@apache.org wrote: Using the current trunk of Avro I am able to: - extract the schema from the data

Re: footer info in avro

2013-10-09 Thread Doug Cutting
There's no plan I know of to add this. Avro's original file format wrote metadata at the end of the file. This was changed in Avro 1.3 so that files could always be processed sequentially, without seeking to the end. Doug On Wed, Oct 9, 2013 at 5:08 PM, Venkat vramac...@ymail.com wrote: Hi All

Re: Unable to compile a namespace-less schema

2013-10-09 Thread Doug Cutting
Using the current trunk of Avro I am able to: - extract the schema from the data file you provided (using avro-tools schema command) - generate Java classes for this schema (using the avro-tools compile command) - compile these generated Java classes (using the javac command) Can you provide a
