Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1543959178

## docs/protocols/xlang_object_graph_spec.md: ##

@@ -0,0 +1,747 @@
+# Cross language object graph serialization
+
+> Format Version History:
+> - Version 0.1 - serialization spec formalized
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to and from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert object to/from Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks. So the format will be more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number, including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. A Rust enum with non-predefined field values is not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs. Mutable types such as `list/map/set/array/tensor/arrow` are not allowed as map keys.
+- time types:
+    - duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+    - timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+      to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only allows numeric components. Other arrays will be taken as List. The implementation should support
+  interoperability between array and list.
+    - array: multidimensional array in which sub-arrays can have different sizes but all have the same type.
+    - bool_array: one dimensional bool array.
+    - int16_array: one dimensional int16 array.
+    - int32_array: one dimensional int32 array.
+    - int64_array: one dimensional int64 array.
+    - float16_array: one dimensional half_float_16 array.
+    - float32_array: one dimensional float32 array.
+    - float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provi
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on code in PR #1413: URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1543944897 ## docs/protocols/xlang_object_graph_spec.md: ## @@ -0,0 +1,747 @@
(incubator-fury) branch main updated: chore(java): Reuse unsafePutPositiveVarInt in unsafeWritePositiveVarInt (#1434)
This is an automated email from the ASF dual-hosted git repository.

chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury.git

The following commit(s) were added to refs/heads/main by this push:
     new 7d64ede5 chore(java): Reuse unsafePutPositiveVarInt in unsafeWritePositiveVarInt (#1434)
7d64ede5 is described below

commit 7d64ede53158d648befb3fec18465e96b9d89ca2
Author: LiangliangSui <116876207+liangliang...@users.noreply.github.com>
AuthorDate: Fri Mar 29 01:13:01 2024 +0800

    chore(java): Reuse unsafePutPositiveVarInt in unsafeWritePositiveVarInt (#1434)

    N/A

    Signed-off-by: LiangliangSui
---
 .../java/org/apache/fury/memory/MemoryBuffer.java | 46 ++
 1 file changed, 3 insertions(+), 43 deletions(-)

diff --git a/java/fury-core/src/main/java/org/apache/fury/memory/MemoryBuffer.java b/java/fury-core/src/main/java/org/apache/fury/memory/MemoryBuffer.java
index 11ca99a9..748aea14 100644
--- a/java/fury-core/src/main/java/org/apache/fury/memory/MemoryBuffer.java
+++ b/java/fury-core/src/main/java/org/apache/fury/memory/MemoryBuffer.java
@@ -1233,49 +1233,9 @@ public final class MemoryBuffer {
    * to avoid using two memory operations.
    */
   public int unsafeWritePositiveVarInt(int v) {
-    // The encoding algorithm are based on kryo UnsafeMemoryOutput.writeVarInt
-    // varint are written using little endian byte order.
-    // This version should have better performance since it remove an index update.
-    long value = v;
-    final int writerIndex = this.writerIndex;
-    long varInt = (value & 0x7F);
-    value >>>= 7;
-    if (value == 0) {
-      UNSAFE.putByte(heapMemory, address + writerIndex, (byte) varInt);
-      this.writerIndex = writerIndex + 1;
-      return 1;
-    }
-    // bit 8 `set` indicates have next data bytes.
-    varInt |= 0x80;
-    varInt |= ((value & 0x7F) << 8);
-    value >>>= 7;
-    if (value == 0) {
-      unsafePutInt(writerIndex, (int) varInt);
-      this.writerIndex = writerIndex + 2;
-      return 2;
-    }
-    varInt |= (0x80 << 8);
-    varInt |= ((value & 0x7F) << 16);
-    value >>>= 7;
-    if (value == 0) {
-      unsafePutInt(writerIndex, (int) varInt);
-      this.writerIndex = writerIndex + 3;
-      return 3;
-    }
-    varInt |= (0x80 << 16);
-    varInt |= ((value & 0x7F) << 24);
-    value >>>= 7;
-    if (value == 0) {
-      unsafePutInt(writerIndex, (int) varInt);
-      this.writerIndex = writerIndex + 4;
-      return 4;
-    }
-    varInt |= (0x80L << 24);
-    varInt |= ((value & 0x7F) << 32);
-    varInt &= 0xFL;
-    unsafePutLong(writerIndex, varInt);
-    this.writerIndex = writerIndex + 5;
-    return 5;
+    int varintBytes = unsafePutPositiveVarInt(writerIndex, v);
+    writerIndex += varintBytes;
+    return varintBytes;
   }

   /**

To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org
For additional commands, e-mail: commits-h...@fury.apache.org
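For readers following the removed lines: their comments describe a little-endian variable-length encoding with 7 payload bits per byte, where the high bit of a byte is set when more bytes follow. A byte-at-a-time sketch of that encoding is given below. It is only an illustration and does not reproduce the single-word write used in `MemoryBuffer`; `VarIntSketch`, `putPositiveVarInt` and `readPositiveVarInt` are hypothetical names, not Fury's API.

```java
class VarIntSketch {
  /** Writes v (treated as unsigned) starting at buf[index]; returns the number of bytes written (1 to 5). */
  static int putPositiveVarInt(byte[] buf, int index, int v) {
    int start = index;
    // While more than 7 significant bits remain, emit the low 7 bits with the continuation bit set.
    while ((v & ~0x7F) != 0) {
      buf[index++] = (byte) ((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    // Last byte: continuation bit clear.
    buf[index++] = (byte) v;
    return index - start;
  }

  /** Reads a value written by putPositiveVarInt starting at buf[index]. */
  static int readPositiveVarInt(byte[] buf, int index) {
    int result = 0;
    int shift = 0;
    while (true) {
      byte b = buf[index++];
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return result;
      }
      shift += 7;
    }
  }

  public static void main(String[] args) {
    byte[] buf = new byte[5];
    int written = putPositiveVarInt(buf, 0, 300);
    System.out.println(written + " bytes, decoded " + readPositiveVarInt(buf, 0));
  }
}
```

Small values take one byte and any 32-bit value fits in five, which is why the removed code returns a length between 1 and 5.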
Re: [PR] chore(java): Reuse unsafePutPositiveVarInt in unsafeWritePositiveVarInt [incubator-fury]
chaokunyang merged PR #1434: URL: https://github.com/apache/incubator-fury/pull/1434 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For additional commands, e-mail: commits-h...@fury.apache.org
[PR] chore(java): Reuse unsafePutPositiveVarInt in unsafeWritePositiveVarInt [incubator-fury]
LiangliangSui opened a new pull request, #1434: URL: https://github.com/apache/incubator-fury/pull/1434 N/A
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
LiangliangSui commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025521000

> > > > > One more thing is that do we need to add a int16 magic number at the header? JDK serialization use two bytes, avro use 4 bytes, kryo/protobuf/flatbuffer doesn't add magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning
> > > >
> > > > I think it would be necessary, we can use the magic number to indicate the protocol version
> > >
> > > There are 4 bits empty in fury header, which can be used to indicate the protocol version too.
> >
> > I have noticed that sometimes, applications may encapsulate the data buffer using a proprietary protocol. On the receiving end, the use of a magic number is helpful for them to identify which protocol is being used.
>
> Added two bytes header to the spec

We could use two bytes to store the magic number and two bytes to store the protocol version, just like [JDK serialization](https://docs.oracle.com/javase/8/docs/api/java/io/ObjectStreamConstants.html#STREAM_MAGIC) (maybe one byte is enough for the protocol version). We would still keep the 4 empty bits in the header free, leaving room for future expansion. After all, 4 bits can represent only a relatively small number of version numbers.
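The layout proposed in this comment can be sketched as follows: a minimal example, assuming a two-byte magic number followed by a two-byte protocol version, both big-endian as in the JDK serialization stream header. The magic value `0x1234`, the version `1`, the byte order, and the names `HeaderSketch`, `writeHeader` and `readVersion` are placeholders for illustration; nothing in this thread fixes them.

```java
import java.nio.ByteBuffer;

class HeaderSketch {
  static final short MAGIC = (short) 0x1234; // hypothetical magic number
  static final short VERSION = 1;            // hypothetical protocol version

  /** Prepends the 4-byte header (magic, then version) to the payload. */
  static byte[] writeHeader(byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(4 + payload.length); // big-endian by default
    buf.putShort(MAGIC).putShort(VERSION).put(payload);
    return buf.array();
  }

  /** Returns the protocol version, or throws if the data does not start with the magic number. */
  static int readVersion(byte[] data) {
    if (data.length < 4) {
      throw new IllegalArgumentException("too short to contain a header");
    }
    ByteBuffer buf = ByteBuffer.wrap(data);
    if (buf.getShort() != MAGIC) {
      throw new IllegalArgumentException("bad magic number");
    }
    return buf.getShort() & 0xFFFF;
  }

  public static void main(String[] args) {
    byte[] framed = writeHeader(new byte[] {1, 2, 3});
    System.out.println("version = " + readVersion(framed));
  }
}
```

A receiver that wraps several protocols can call something like `readVersion` first and reject foreign buffers before attempting to deserialize them, which is the use case raised earlier in the thread.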
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025195417

> > > > One more thing is that do we need to add a int16 magic number at the header? JDK serialization use two bytes, avro use 4 bytes, kryo/protobuf/flatbuffer doesn't add magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning
> > >
> > > I think it would be necessary, we can use the magic number to indicate the protocol version
> >
> > There are 4 bits empty in fury header, which can be used to indicate the protocol version too.
>
> I have noticed that sometimes, applications may encapsulate the data buffer using a proprietary protocol. On the receiving end, the use of a magic number is helpful for them to identify which protocol is being used.

Added two bytes header to the spec
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542915697

## docs/protocols/xlang_object_graph_spec.md: ##

@@ -0,0 +1,692 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to and from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert object to/from Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks. So the format will be more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number, including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. A Rust enum with non-predefined field values is not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs. Mutable types such as `list/map/set/array/tensor/arrow` are not allowed as map keys.
+- time types:
+    - duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+    - timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+      to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only allows numeric components. Other arrays will be taken as List. The implementation should support
+  interoperability between array and list.
+    - array: multidimensional array in which sub-arrays can have different sizes but all have the same type.
+    - bool_array: one dimensional bool array.
+    - int16_array: one dimensional int16 array.
+    - int32_array: one dimensional int32 array.
+    - int64_array: one dimensional int64 array.
+    - float16_array: one dimensional half_float_16 array.
+    - float32_array: one dimensional float32 array.
+    - float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025070621

> > > One more thing is that do we need to add a int16 magic number at the header? JDK serialization use two bytes, avro use 4 bytes, kryo/protobuf/flatbuffer doesn't add magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning
> >
> > I think it would be necessary, we can use the magic number to indicate the protocol version
>
> There are 4 bits empty in fury header, which can be used to indicate the protocol version too.

I have noticed that sometimes, applications may encapsulate the data buffer using a proprietary protocol. On the receiving end, the use of a magic number is helpful for them to identify which protocol is being used.
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on code in PR #1413: URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542845355 ## docs/protocols/xlang_object_graph_spec.md: ## @@ -0,0 +1,692 @@
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413: URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542829185 ## docs/protocols/xlang_object_graph_spec.md: ## @@ -0,0 +1,692 @@
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025033866

> > One more thing: do we need to add an int16 magic number to the header? JDK serialization uses two bytes, Avro uses 4 bytes, and Kryo/Protobuf/FlatBuffers don't add a magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning
>
> I think it would be necessary; we can use the magic number to indicate the protocol version.

There are 4 unused bits in the Fury header, which could also be used to indicate the protocol version.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

-
To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org
For additional commands, e-mail: commits-h...@fury.apache.org
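To make the "4 unused header bits" idea concrete, here is a minimal Java sketch of packing a protocol version into the spare bits of a one-byte header. The layout is an assumption for illustration only (flags in the low 4 bits, version in the high 4 bits); the class and method names are hypothetical and nothing here is taken from the Fury spec.

```java
class HeaderVersionBitsSketch {
  // Assumed layout (hypothetical): low 4 bits hold flags, high 4 bits hold the version.
  static final int FLAGS_MASK = 0x0F;
  static final int VERSION_SHIFT = 4;

  /** Packs 4 flag bits and a 4-bit version (0..15) into a single header byte. */
  static byte pack(int flags, int version) {
    if ((flags & ~FLAGS_MASK) != 0 || version < 0 || version > 15) {
      throw new IllegalArgumentException("flags and version must each fit in 4 bits");
    }
    return (byte) ((version << VERSION_SHIFT) | flags);
  }

  /** Extracts the version from the high 4 bits. */
  static int version(byte header) {
    // Mask to an unsigned value first so a set top bit does not sign-extend.
    return (header & 0xFF) >>> VERSION_SHIFT;
  }

  /** Extracts the flag bits from the low 4 bits. */
  static int flags(byte header) {
    return header & FLAGS_MASK;
  }

  public static void main(String[] args) {
    byte header = pack(0b0101, 2);
    System.out.println("version=" + version(header) + ", flags=" + flags(header));
  }
}
```

The trade-off is that 4 bits allow only 16 versions, but the header size stays unchanged.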
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542823938

## docs/protocols/xlang_object_graph_spec.md: ##
@@ -0,0 +1,671 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to/from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert objects to/from the Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks, so the format is more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. Rust enums with non-predefined field values are not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs.
+- time types:
+- duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+- timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+  to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only numeric components are allowed. Other arrays will be treated as List. The implementation should support
+  interoperability between array and list.
+- array: multidimensional array in which every sub-array can have a different size but all have the same type.
+- bool_array: one dimensional bool array.
+- int16_array: one dimensional int16 array.
+- int32_array: one dimensional int32 array.
+- int64_array: one dimensional int64 array.
+- float16_array: one dimensional float16 array.
+- float32_array: one dimensional float32 array.
+- float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between the type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use struct tag.
+- python: use typehint.
+- rust: use macro.
+
+### Type ID
+
+All internal
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025023227

> One more thing: do we need to add an int16 magic number to the header? JDK serialization uses two bytes, Avro uses 4 bytes, and Kryo/Protobuf/FlatBuffers don't add a magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning

I think it would be necessary; we can use the magic number to indicate the protocol version.
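As a rough illustration of the proposal above, the following Java sketch writes a 2-byte (int16) magic number whose value also carries the protocol version, and validates it on read. The signature byte `0xF5`, the split into a signature byte plus a version byte, the little-endian byte order, and all names are assumptions made for this example; none of them come from the Fury spec.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MagicNumberSketch {
  // Hypothetical signature byte; not a value defined by the Fury spec.
  static final int SIGNATURE = 0xF5;

  /** Writes an int16 magic number: signature in the high byte, version in the low byte. */
  static void writeMagic(ByteBuffer out, int version) {
    if (version < 0 || version > 0xFF) {
      throw new IllegalArgumentException("version must fit in one byte: " + version);
    }
    out.putShort((short) ((SIGNATURE << 8) | version));
  }

  /** Reads the int16 magic number, rejects foreign data, and returns the version. */
  static int readVersion(ByteBuffer in) {
    int magic = in.getShort() & 0xFFFF; // treat the short as unsigned
    if ((magic >>> 8) != SIGNATURE) {
      throw new IllegalArgumentException("bad magic number: 0x" + Integer.toHexString(magic));
    }
    return magic & 0xFF;
  }

  public static void main(String[] args) {
    // Byte order is an assumption; writer and reader only need to agree on it.
    ByteBuffer buf = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
    writeMagic(buf, 1);
    buf.flip();
    System.out.println("protocol version = " + readVersion(buf));
  }
}
```

A reader that checks the magic number first can fail fast on data produced by another framework instead of misinterpreting it.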
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542798344

## docs/protocols/xlang_object_graph_spec.md: ##
@@ -0,0 +1,692 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to/from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert objects to/from the Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks, so the format is more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. Rust enums with non-predefined field values are not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs. Mutable types such as `list/map/set/array/tensor/arrow` are not allowed as keys of a map.
+- time types:
+- duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+- timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+  to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only numeric components are allowed. Other arrays will be treated as List. The implementation should support
+  interoperability between array and list.
+- array: multidimensional array in which every sub-array can have a different size but all have the same type.
+- bool_array: one dimensional bool array.
+- int16_array: one dimensional int16 array.
+- int32_array: one dimensional int32 array.
+- int64_array: one dimensional int64 array.
+- float16_array: one dimensional float16 array.
+- float32_array: one dimensional float32 array.
+- float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between the type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use s
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#issuecomment-2025006408

One more thing: do we need to add an int16 magic number to the header? JDK serialization uses two bytes, Avro uses 4 bytes, and Kryo/Protobuf/FlatBuffers don't add a magic number. @theweipeng @PragmaTwice @LiangliangSui @pjfanning
[I] Execute an error in the openj9 environment [incubator-fury]
mintonzhang opened a new issue, #1433:
URL: https://github.com/apache/incubator-fury/issues/1433

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-fury/issues) and found no similar issues.

### Version

fury-core: 0.4.0

openjdk version "17.0.10" 2024-01-16
IBM Semeru Runtime Open Edition 17.0.10.0 (build 17.0.10+7)
Eclipse OpenJ9 VM 17.0.10.0 (build openj9-0.43.0, JRE 17 Mac OS X amd64-64-Bit Compressed References 20240116_636 (JIT enabled, AOT enabled)
OpenJ9 - 2c3d78b48
OMR - ea8124dbc
JCL - 2aad089841f based on jdk-17.0.10+7)

### Component(s)

Java

### Minimal reproduce step

```java
@Getter
@Setter
public static class UserCacheInfoTest {
  private String extraUserId;
  private String unionId;
  private String name;
  private String jobNumber;
  private String email;
  private Long userId;
  private String mobile;
  private String avatar;
  private String position;
  private String deptName;
}

public static void main(String[] args) {
  UserCacheInfoTest userCacheInfoTest = new UserCacheInfoTest();
  userCacheInfoTest.setExtraUserId("1");
  userCacheInfoTest.setUnionId("1");
  userCacheInfoTest.setName("1");
  userCacheInfoTest.setJobNumber("1");
  userCacheInfoTest.setEmail("1");
  userCacheInfoTest.setUserId(0L);
  userCacheInfoTest.setMobile("1");
  userCacheInfoTest.setAvatar("1");
  userCacheInfoTest.setPosition("1");
  userCacheInfoTest.setDeptName("1");
  //ThreadLocalFury fury = new ThreadLocalFury(classLoader -> );
  Fury fury = Fury.builder()
      .withLanguage(Language.JAVA)
      .withCodegen(true)
      //.registerGuavaTypes(true)
      .withCompatibleMode(CompatibleMode.COMPATIBLE)
      .requireClassRegistration(false)
      .suppressClassRegistrationWarnings(true)
      .build();
  // This call throws the RuntimeException reported in this issue.
  byte[] serialize = fury.serialize(userCacheInfoTest);
  UserCacheInfoTest deserialize = (UserCacheInfoTest) fury.deserialize(serialize);
  System.out.println();
}
```

### What did you expect to see?

Normal serialization and deserialization

### What did you see instead?
```
Exception in thread "main" java.lang.RuntimeException: Create compatible serializer failed, class: class com.jasolar.todo.TestConfig$UserCacheInfoTest
    at io.fury.serializer.CodegenSerializer.loadCompatibleCodegenSerializer(CodegenSerializer.java:58)
    at io.fury.resolver.ClassResolver.lambda$getObjectSerializerClass$4(ClassResolver.java:976)
    at io.fury.builder.JITContext.registerSerializerJITCallback(JITContext.java:132)
    at io.fury.resolver.ClassResolver.getObjectSerializerClass(ClassResolver.java:971)
    at io.fury.resolver.ClassResolver.getSerializerClass(ClassResolver.java:894)
    at io.fury.resolver.ClassResolver.getSerializerClass(ClassResolver.java:791)
    at io.fury.resolver.ClassResolver.createSerializer(ClassResolver.java:1182)
    at io.fury.resolver.ClassResolver.getOrUpdateClassInfo(ClassResolver.java:1114)
    at io.fury.Fury.writeRef(Fury.java:342)
    at io.fury.Fury.write(Fury.java:319)
    at io.fury.Fury.serialize(Fury.java:255)
    at io.fury.Fury.serialize(Fury.java:208)
    at com.jasolar.todo.TestConfig.main(TestConfig.java:114)
Caused by: io.fury.codegen.CodegenException: Compile error:
com.jasolar.todo.TestConfig_UserCacheInfoTestFuryCompatibleCodec_1_-2072095631_213584418:
/* 0001 */ package com.jasolar.todo;
/* 0002 */
/* 0003 */ import java.util.List;
/* 0004 */ import java.util.Map;
/* 0005 */ import java.util.Set;
/* 0006 */ import io.fury.Fury;
/* 0007 */ import io.fury.memory.MemoryBuffer;
/* 0008 */ import io.fury.resolver.NoRefResolver;
/* 0009 */ import io.fury.resolver.ClassInfo;
/* 0010 */ import io.fury.resolver.ClassInfoHolder;
/* 0011 */ import io.fury.resolver.ClassResolver;
/* 0012 */ import io.fury.builder.Generated;
/* 0013 */ import io.fury.serializer.CodegenSerializer.LazyInitBeanSerializer;
/* 0014 */ import io.fury.serializer.Serializers.EnumSerializer;
/* 0015 */ import io.fury.serializer.Serializer;
/* 0016 */ import io.fury.serializer.StringSerializer;
/* 0017 */ import io.fury.serializer.ObjectSerializer;
/* 0018 */ import io.fury.serializer.CompatibleSerializer;
/* 0019 */ import io.fury.serializer.collection.AbstractCollectionSerializer;
/* 0020 */ import io.fury.serializer.collection.AbstractMapSerializer;
/* 0021 */
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
theweipeng commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542773581

## docs/protocols/xlang_object_graph_spec.md: ##
@@ -0,0 +1,671 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to/from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert objects to/from the Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks, so the format is more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. Rust enums with non-predefined field values are not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs.
+- time types:
+- duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+- timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+  to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only numeric components are allowed. Other arrays will be treated as List. The implementation should support
+  interoperability between array and list.
+- array: multidimensional array in which every sub-array can have a different size but all have the same type.
+- bool_array: one dimensional bool array.
+- int16_array: one dimensional int16 array.
+- int32_array: one dimensional int32 array.
+- int64_array: one dimensional int64 array.
+- float16_array: one dimensional float16 array.
+- float32_array: one dimensional float32 array.
+- float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between the type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use struct tag.
+- python: use typehint.
+- rust: use macro.
+
+### Type ID
+
+All internal d
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542764740

## docs/protocols/xlang_object_graph_spec.md: ##
@@ -0,0 +1,671 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury converts an object to/from the Fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert objects to/from the Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible
+and much easier to use, but
+also introduce more complexity compared to static serialization frameworks, so the format is more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: an 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which uses fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which uses two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which uses fury PVL encoding.
+- sli_int64: a 64-bit signed integer which uses fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which uses two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. Rust enums with non-predefined field values are not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs.
+- time types:
+- duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+- timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+  to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: a variable-length array of bytes.
+- array type: only numeric components are allowed. Other arrays will be treated as List. The implementation should support
+  interoperability between array and list.
+- array: multidimensional array in which every sub-array can have a different size but all have the same type.
+- bool_array: one dimensional bool array.
+- int16_array: one dimensional int16 array.
+- int32_array: one dimensional int32 array.
+- int64_array: one dimensional int64 array.
+- float16_array: one dimensional float16 array.
+- float32_array: one dimensional float32 array.
+- float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language supports those types.
+
+### Type disambiguation
+
+Due to differences between the type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury uses the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both the `objects` and `objectList` fields in the serialized data have the `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create an `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or for the type as a whole. Here is an example in Java which uses
+annotations to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use struct tag.
+- python: use typehint.
+- rust: use macro.
+
+### Type ID
+
+All internal
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413:
URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1538852696

## docs/protocols/xlang_object_graph_spec.md: ##
@@ -0,0 +1,657 @@
+# Cross language object graph serialization
+
+Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism.
+Fury will convert an object from/to fury xlang serialization binary format.
+Fury has two core concepts for xlang serialization:
+
+- **Fury xlang binary format**
+- **Framework implemented in different languages to convert object to/from Fury xlang binary format**
+
+The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible,
+much more easy to use, but
+also introduce more complexities compared to static serialization frameworks. So the format will be more complex.
+
+## Type Systems
+
+### Data Types
+
+- bool: a boolean value (true or false).
+- int8: a 8-bit signed integer.
+- int16: a 16-bit signed integer.
+- int32: a 32-bit signed integer.
+- var_int32: a 32-bit signed integer which use fury var_int32 encoding.
+- fixed_int32: a 32-bit signed integer which use two's complement encoding.
+- int64: a 64-bit signed integer.
+- var_int64: a 64-bit signed integer which use fury PVL encoding.
+- sli_int64: a 64-bit signed integer which use fury SLI encoding.
+- fixed_int64: a 64-bit signed integer which use two's complement encoding.
+- float16: a 16-bit floating point number.
+- float32: a 32-bit floating point number.
+- float64: a 64-bit floating point number including NaN and Infinity.
+- string: a text string encoded using Latin1/UTF16/UTF-8 encoding.
+- enum: a data type consisting of a set of named values. Rust enum with non-predefined field values are not supported as
+  an enum.
+- list: a sequence of objects.
+- set: an unordered set of unique elements.
+- map: a map of key-value pairs.
+- time types:
+- duration: an absolute length of time, independent of any calendar/timezone, as a count of nanoseconds.
+- timestamp: a point in time, independent of any calendar/timezone, as a count of nanoseconds. The count is relative
+  to an epoch at UTC midnight on January 1, 1970.
+- decimal: exact decimal value represented as an integer value in two's complement.
+- binary: an variable-length array of bytes.
+- array type: only allow numeric components. Other arrays will be taken as List. The implementation should support the
+  interoperability between array and list.
+- array: multidimensional array which every sub-array can have different sizes but all have same type.
+- bool_array: one dimensional int16 array.
+- int16_array: one dimensional int16 array.
+- int32_array: one dimensional int32 array.
+- int64_array: one dimensional int64 array.
+- float16_array: one dimensional half_float_16 array.
+- float32_array: one dimensional float32 array.
+- float64_array: one dimensional float64 array.
+- tensor: a multidimensional dense array of fixed-size values such as a NumPy ndarray.
+- sparse tensor: a multidimensional array whose elements are almost all zeros.
+- arrow record batch: an arrow [record batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
+- arrow table: an arrow [table](https://arrow.apache.org/docs/cpp/tables.html#tables) object.
+
+Note:
+
+- Unsigned int/long are not added here, since not every language support those types.
+
+### Type disambiguation
+
+Due to differences between type systems of languages, those types can't be mapped one-to-one between languages. When
+deserializing, Fury use the target data structure type and the data type in the data jointly to determine how to
+deserialize and populate the target data structure. For example:
+
+```java
+class Foo {
+  int[] intArray;
+  Object[] objects;
+  List objectList;
+}
+
+class Foo2 {
+  int[] intArray;
+  List objects;
+  List objectList;
+}
+```
+
+`intArray` has an `int32_array` type. But both `objects` and `objectList` fields in the serialize data have `list` data
+type. When deserializing, the implementation will create an `Object` array for `objects`, but create a `ArrayList`
+for `objectList` to populate its elements. And the serialized data of `Foo` can be deserialized into `Foo2` too.
+
+Users can also provide meta hints for fields of a type, or the type whole. Here is an example in java which use
+annotation to provide such information.
+
+```java
+
+@TypeInfo(fieldsNullable = false, trackingRef = false, polymorphic = false)
+class Foo {
+  @FieldInfo(trackingRef = false)
+  int[] intArray;
+  @FieldInfo(polymorphic = true)
+  Object object;
+  @FieldInfo(tagId = 1, nullable = true)
+  List objectList;
+}
+```
+
+Such information can be provided in other languages too:
+
+- cpp: use macro and template.
+- golang: use struct tag.
+- python: use typehint.
+- rust: use macro.
+
+### Type ID
+
+All internal
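Editor's illustration of the "Type disambiguation" passage quoted in the review comment above: a minimal sketch, not Fury code, of how a deserializer might pick the container for a wire-level `list` from the declared type of the target field. The class `ListTargetSketch` and its `materialize` helper are hypothetical names introduced only for this example, and the sketch assumes the list elements have already been decoded.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListTargetSketch {
  /**
   * Hypothetical helper (not part of Fury): turns the already-decoded elements of a
   * wire-level `list` into whatever container the target field declares.
   */
  static Object materialize(List<?> decoded, Class<?> targetType) {
    // Reference-typed arrays such as Object[] (primitive arrays have their own wire types).
    if (targetType.isArray() && !targetType.getComponentType().isPrimitive()) {
      Object[] array =
          (Object[]) Array.newInstance(targetType.getComponentType(), decoded.size());
      return decoded.toArray(array);
    }
    // List, Collection, Iterable or Object fields can all hold an ArrayList.
    if (targetType.isAssignableFrom(ArrayList.class)) {
      return new ArrayList<Object>(decoded);
    }
    throw new IllegalArgumentException("cannot populate " + targetType + " from a list payload");
  }

  public static void main(String[] args) {
    List<Object> decoded = Arrays.asList("a", 1, 2.5);
    Object asArray = materialize(decoded, Object[].class); // like Foo.objects
    Object asList = materialize(decoded, List.class); // like Foo2.objects
    System.out.println(asArray.getClass().getSimpleName());
    System.out.println(asList.getClass().getSimpleName());
  }
}
```

The same decoded elements end up in an `Object[]` or an `ArrayList` depending only on the target type, which is the property that lets data written from `Foo` be read into `Foo2`.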
Re: [PR] feat(spec): standardizing fury cross-language serialization specification [incubator-fury]
chaokunyang commented on code in PR #1413: URL: https://github.com/apache/incubator-fury/pull/1413#discussion_r1542685465 ## docs/protocols/xlang_object_graph_spec.md: ## @@ -0,0 +1,612 @@ +# Cross language object graph serialization + +Fury xlang serialization is an automatic object serialization framework that supports reference and polymorphism. +Fury will convert an object from/to fury xlang serialization binary format. +Fury has two core concepts for xlang serialization: + +- **Fury xlang binary format** +- **Framework implemented in different languages to convert object to/from Fury xlang binary format** + +The serialization format is a dynamic binary format. The dynamics and reference/polymorphism support make Fury flexible, +much more easy to use, but +also introduce more complexities compared to static serialization frameworks. So the format will be more complex. + +## Type Systems + +### Data Types + +- bool: A boolean value (true or false). +- byte: An 8-bit signed integer. +- i16: A 16-bit signed integer. +- i32: A 32-bit signed integer. +- i64: A 64-bit signed integer. +- half-float: A 16-bit floating point number. +- float: A 32-bit floating point number. +- double: A 64-bit floating point number including NaN and Infinity. +- string: A text string encoded using Latin1/UTF16/UTF-8 encoding. +- enum: a data type consisting of a set of named values. Rust enum with non-predefined field values are not supported as + an enum +- list: A sequence of objects. +- set: An unordered set of unique elements. +- map: A map of key-value pairs. Review Comment: updated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For additional commands, e-mail: commits-h...@fury.apache.org
svn commit: r68178 - in /dev/incubator/fury/0.5.0-rc2: ./ apache-fury-0.5.0-rc2-incubating-src.tar.gz apache-fury-0.5.0-rc2-incubating-src.tar.gz.asc apache-fury-0.5.0-rc2-incubating-src.tar.gz.sha512
Author: chaokunyang Date: Thu Mar 28 10:27:15 2024 New Revision: 68178 Log: add incubtaing to files Added: dev/incubator/fury/0.5.0-rc2/ dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz (with props) dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.asc dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.sha512 Added: dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz == Binary file - no diff available. Propchange: dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz -- svn:mime-type = application/octet-stream Added: dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.asc == --- dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.asc (added) +++ dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.asc Thu Mar 28 10:27:15 2024 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIzBAABCAAdFiEEHiza5MCK19aU0csTnXvo5F5YC6QFAmYFRcgACgkQnXvo5F5Y +C6TdWQ//e7RnABvH5xzHP+g590C9Htg38x69fzS5MbS96HQzsSfP77XbuxYIDfSO +Dr8wLNgT/w4zMK1wI7j+4ANH4Hq86LsV09ye1aTjjH2c+dTteQT+G160zKdl0pHx +6PCijG0ohvnZG6sEmGZQpaMbrtTnShWnXyvfURRFcuY5epcP5c3qp3G3oVzkfYLN +hA5biNGwpGsF/VLkCbb+ZO98jM9LDCAZBr5ij2ZoJ68OoZ26jZ8oWOSSQsrMPFgC +MsdCH1AMxZAYe84BT9PfetswBdC+m+KwbqH0+PKWIQ8OfcP2mUjyDeMTfU5++/lQ +RrguJYFvRTdKR6S9pCFXjngiTv4XZ7uV9EXRQ1UfoL1/un+gY6o+SY4i2MLihb4y +LdpPVif6RMtPgQBJSEAxQjA0V1RkivZKO+KUjJCpuCceVGoWF7EbSUFKjMUi9ZtM +yV91b6i76qdDKOrubvsCgADCr0UEyd/ixd0h16NbL9a5GabtKeT88NtyKRXvABKR +7BxiQ+RYr08BpEfwJH0YKY1sOAPFzHwxLbCs3ivWMhDCKmda9b6u0cvxLo7Rz6Hz +wLFM9FQAf2vo1HQTzk0WkZ138QCouL217GoE0uQXfCAophI01yEvbYYnoiYkj22D +LlReb3IIamNbp6QhS3QxK4B/wboHDT02koJ0JWfcQNFolQIkNBw= +=zUzT +-END PGP SIGNATURE- Added: dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.sha512 == --- dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.sha512 (added) +++ 
dev/incubator/fury/0.5.0-rc2/apache-fury-0.5.0-rc2-incubating-src.tar.gz.sha512 Thu Mar 28 10:27:15 2024 @@ -0,0 +1 @@ +e3101b2c7a603a690d7a45aab7eb33d713c645a38c4cdf9a2b49653a65fe5eebec84d6016c9c38352a3e8c852338d42b94d240fdcc4b3f78d30961237ae2cef4 apache-fury-0.5.0-rc2-incubating-src.tar.gz
svn commit: r68175 - in /dev/incubator/fury/0.5.0-rc2: ./ incubator-fury-0.5.0-rc2.tar.gz incubator-fury-0.5.0-rc2.tar.gz.asc incubator-fury-0.5.0-rc2.tar.gz.sha512
Author: chaokunyang Date: Thu Mar 28 10:02:03 2024 New Revision: 68175 Log: Prepare for 0.5.0-rc2 Added: dev/incubator/fury/0.5.0-rc2/ dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz (with props) dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.asc dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.sha512 Added: dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz == Binary file - no diff available. Propchange: dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz -- svn:mime-type = application/octet-stream Added: dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.asc == --- dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.asc (added) +++ dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.asc Thu Mar 28 10:02:03 2024 @@ -0,0 +1,16 @@ +-BEGIN PGP SIGNATURE- + +iQIzBAABCAAdFiEEHiza5MCK19aU0csTnXvo5F5YC6QFAmYFP5gACgkQnXvo5F5Y +C6TDdg/9EO+UE51WduSv3yCIvGTVgDeFp5a36KcEW14fmO5rDkbbB+Kt5/Icn+Su +0hWniLj/Is39TrMsRXcBG6+GmTNcx1zk3yrftsLWR9pYuYWOsT20OuVYzdNnvmz4 +14bdy5EQo/KNhvwryhwrccsz5Lz0bwl2JUIbmWA2rcCAYoVa5NXhCwlawc8CTPyB +y9jw+TxJiNR6+wsFBvxz8JhidjUX2MWZ7clGoNCFP75e8VuGVAjhmIghypYqqjw3 +k7r1Hth7zJu/8rhpSV+eFNBunaKq0OnwlbCTl0z3HQaLhp3GRZgt3t2qrlrjd7M/ +2YTJqlQjH7duXr/6Hlup+/PNQHQ25O66REmVvy+sdbH+ybGEIa775F9eudzIsEgc +vWcAyZfqygv+g/cvHPPObNzZVYP+dciOyjmwDc+38G3WoKvu2JYrVKEonacg3I31 +lwPDOQTon89gX6WAbQCD43jikGy6aMMz7ET8nqTuQxERwOmzHTSwaoM5HMcc1PFA +J5mJGeN9LyX7Pk4nTJz0NkOuWfbtJ0mZdgR4YRh0aXe4OSGP5lGBweLSqtlh9NnX +8a5C0ejMJ+hAxU+OGj8AL37YnYHyESeI+UPSmy83/KtRHEN3ITxOGkRgCAyELZQq +d3dm5LkMoY3qqFaT7BVpvot3dg4yNYYtrAMsPpOdbZKCIsf5n70= +=XZhM +-END PGP SIGNATURE- Added: dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.sha512 == --- dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.sha512 (added) +++ dev/incubator/fury/0.5.0-rc2/incubator-fury-0.5.0-rc2.tar.gz.sha512 Thu Mar 28 10:02:03 2024 @@ -0,0 +1 @@ 
+e3101b2c7a603a690d7a45aab7eb33d713c645a38c4cdf9a2b49653a65fe5eebec84d6016c9c38352a3e8c852338d42b94d240fdcc4b3f78d30961237ae2cef4 incubator-fury-0.5.0-rc2.tar.gz
(incubator-fury) 01/01: bump version to 0.5.0-rc2
This is an automated email from the ASF dual-hosted git repository. chaokunyang pushed a commit to tag 0.5.0-rc2 in repository https://gitbox.apache.org/repos/asf/incubator-fury.git commit 3f30419ae08c07bb992e48b5ff07f1463130f1c2 Author: chaokunyang AuthorDate: Thu Mar 28 17:54:03 2024 +0800 bump version to 0.5.0-rc2 --- integration_tests/graalvm_tests/pom.xml | 2 +- integration_tests/jdk_compatibility_tests/pom.xml | 2 +- integration_tests/jpms_tests/pom.xml | 2 +- integration_tests/latest_jdk_tests/pom.xml| 2 +- java/benchmark/pom.xml| 2 +- java/fury-core/pom.xml| 2 +- java/fury-format/pom.xml | 2 +- java/fury-test-core/pom.xml | 2 +- java/fury-testsuite/pom.xml | 2 +- java/pom.xml | 2 +- javascript/packages/fury/package.json | 2 +- javascript/packages/hps/package.json | 2 +- python/pyfury/__init__.py | 2 +- rust/Cargo.toml | 2 +- scala/build.sbt | 2 +- 15 files changed, 15 insertions(+), 15 deletions(-) diff --git a/integration_tests/graalvm_tests/pom.xml b/integration_tests/graalvm_tests/pom.xml index 92f7dc2a..59e7b5ba 100644 --- a/integration_tests/graalvm_tests/pom.xml +++ b/integration_tests/graalvm_tests/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 ../../java 4.0.0 diff --git a/integration_tests/jdk_compatibility_tests/pom.xml b/integration_tests/jdk_compatibility_tests/pom.xml index 33a42d80..a6983753 100644 --- a/integration_tests/jdk_compatibility_tests/pom.xml +++ b/integration_tests/jdk_compatibility_tests/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 ../../java 4.0.0 diff --git a/integration_tests/jpms_tests/pom.xml b/integration_tests/jpms_tests/pom.xml index 74e1e029..6b24b4a9 100644 --- a/integration_tests/jpms_tests/pom.xml +++ b/integration_tests/jpms_tests/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 ../../java 4.0.0 diff --git a/integration_tests/latest_jdk_tests/pom.xml b/integration_tests/latest_jdk_tests/pom.xml index 
dca2a9ad..0adfe477 100644 --- a/integration_tests/latest_jdk_tests/pom.xml +++ b/integration_tests/latest_jdk_tests/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 ../../java 4.0.0 diff --git a/java/benchmark/pom.xml b/java/benchmark/pom.xml index 902171b7..b6052471 100644 --- a/java/benchmark/pom.xml +++ b/java/benchmark/pom.xml @@ -26,7 +26,7 @@ fury-parent org.apache.fury -0.5.0-SNAPSHOT +0.5.0-rc2 benchmark diff --git a/java/fury-core/pom.xml b/java/fury-core/pom.xml index 2b2a426c..57267c35 100644 --- a/java/fury-core/pom.xml +++ b/java/fury-core/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 4.0.0 diff --git a/java/fury-format/pom.xml b/java/fury-format/pom.xml index c39d76ec..fd387025 100644 --- a/java/fury-format/pom.xml +++ b/java/fury-format/pom.xml @@ -25,7 +25,7 @@ org.apache.fury fury-parent -0.5.0-SNAPSHOT +0.5.0-rc2 4.0.0 diff --git a/java/fury-test-core/pom.xml b/java/fury-test-core/pom.xml index db081dbf..dea7eb27 100644 --- a/java/fury-test-core/pom.xml +++ b/java/fury-test-core/pom.xml @@ -25,7 +25,7 @@ fury-parent org.apache.fury -0.5.0-SNAPSHOT +0.5.0-rc2 4.0.0 diff --git a/java/fury-testsuite/pom.xml b/java/fury-testsuite/pom.xml index 112d6e1f..3e1e993d 100644 --- a/java/fury-testsuite/pom.xml +++ b/java/fury-testsuite/pom.xml @@ -25,7 +25,7 @@ fury-parent org.apache.fury -0.5.0-SNAPSHOT +0.5.0-rc2 4.0.0 diff --git a/java/pom.xml b/java/pom.xml index bda70865..b9e9b173 100644 --- a/java/pom.xml +++ b/java/pom.xml @@ -33,7 +33,7 @@ org.apache.fury fury-parent pom - 0.5.0-SNAPSHOT + 0.5.0-rc2 Fury Project Parent POM Apache Fury™ is a blazingly fast multi-language serialization framework powered by jit and zero-copy. 
diff --git a/javascript/packages/fury/package.json b/javascript/packages/fury/package.json index 11b6d777..8841d6ec 100644 --- a/javascript/packages/fury/package.json +++ b/javascript/packages/fury/package.json @@ -1,6 +1,6 @@ { "name": "@furyjs/fury", - "version": "0.5.9-beta", + "version": "0.5.0-rc.2", "description": "Apache Fury™(incubating) is a blazingly fast multi-language serialization framework powered by jit and zero-copy", "main": "dist/index.js", "scripts": { diff --git a/javascript/packages/hps/package.json b/javascript/packages/hps/package.json index db6e4c91..362dd79e 100644 --- a/javascript
(incubator-fury) tag 0.5.0-rc2 created (now 3f30419a)
chaokunyang pushed a change to tag 0.5.0-rc2 in repository https://gitbox.apache.org/repos/asf/incubator-fury.git at 3f30419a (commit) This tag includes the following new commits: new 3f30419a bump version to 0.5.0-rc2 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
Re: [I] 【bug】Caused by: org.graalvm.compiler.debug.GraalError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: An object of type 'ch.qos.logback.core.status.InfoStatus' was found in
chaokunyang closed issue #1404: 【bug】Caused by: org.graalvm.compiler.debug.GraalError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: An object of type 'ch.qos.logback.core.status.InfoStatus' was found in the image heap URL: https://github.com/apache/incubator-fury/issues/1404
(incubator-fury) branch main updated: fix(java): fix slf4j on graalvm (#1432)
This is an automated email from the ASF dual-hosted git repository.

chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury.git

The following commit(s) were added to refs/heads/main by this push:
     new 5607cd70 fix(java): fix slf4j on graalvm (#1432)
5607cd70 is described below

commit 5607cd7015eed0156d68debe4bfc8a288f625075
Author: Shawn Yang
AuthorDate: Thu Mar 28 17:49:53 2024 +0800

    fix(java): fix slf4j on graalvm (#1432)

    This PR closes #1404 by:
    - Upgrade slf4j to 2.0.12
    - Using print instead of slf4j for graalvm
---
 ci/run_ci.sh                                       |  2 +-
 .../java/org/apache/fury/util/LoggerFactory.java   | 62 ++
 .../fury-core/native-image.properties              |  1 +
 java/fury-test-core/pom.xml                        |  2 +-
 java/pom.xml                                       |  2 +-
 5 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/ci/run_ci.sh b/ci/run_ci.sh
index 61caaab3..30bbe44b 100755
--- a/ci/run_ci.sh
+++ b/ci/run_ci.sh
@@ -99,7 +99,7 @@ graalvm_test() {
   mvn -T10 -B --no-transfer-progress clean install -DskipTests
   echo "Start to build graalvm native image"
   cd "$ROOT"/integration_tests/graalvm_tests
-  mvn -DskipTests=true -Pnative package
+  mvn -DskipTests=true --no-transfer-progress -Pnative package
   echo "Built graalvm native image"
   echo "Start to run graalvm native image"
   ./target/main
diff --git a/java/fury-core/src/main/java/org/apache/fury/util/LoggerFactory.java b/java/fury-core/src/main/java/org/apache/fury/util/LoggerFactory.java
index 57e56f5a..01f0615c 100644
--- a/java/fury-core/src/main/java/org/apache/fury/util/LoggerFactory.java
+++ b/java/fury-core/src/main/java/org/apache/fury/util/LoggerFactory.java
@@ -19,6 +19,11 @@
 
 package org.apache.fury.util;
 
+import java.lang.reflect.InvocationHandler;
+import java.lang.reflect.Method;
+import java.lang.reflect.Proxy;
+import java.time.LocalDateTime;
+import java.time.format.DateTimeFormatter;
 import org.slf4j.Logger;
 import org.slf4j.helpers.NOPLogger;
 
@@ -40,7 +45,64 @@ public class LoggerFactory {
     if (disableLogging) {
       return NOPLogger.NOP_LOGGER;
     } else {
+      if (GraalvmSupport.IN_GRAALVM_NATIVE_IMAGE) {
+        return (Logger)
+            Proxy.newProxyInstance(
+                clazz.getClassLoader(), new Class[] {Logger.class}, new GraalvmLogger(clazz));
+      }
       return org.slf4j.LoggerFactory.getLogger(clazz);
     }
   }
+
+  private static final class GraalvmLogger implements InvocationHandler {
+    private static final DateTimeFormatter dateTimeFormatter =
+        DateTimeFormatter.ofPattern("-MM-dd hh:mm:ss");
+    private final Class targetClass;
+
+    private GraalvmLogger(Class targetClass) {
+      this.targetClass = targetClass;
+    }
+
+    @Override
+    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
+      String name = method.getName();
+      switch (name) {
+        case "isEnabledForLevel":
+        case "isInfoEnabled":
+        case "isWarnEnabled":
+        case "isErrorEnabled":
+          return true;
+        case "info":
+          log("INFO", false, args);
+          return null;
+        case "warn":
+          log("WARN", false, args);
+          return null;
+        case "error":
+          log("ERROR", false, args);
+          return null;
+        default:
+          return method.invoke(NOPLogger.NOP_LOGGER, args);
+      }
+    }
+
+    private void log(String level, boolean mayPrintTrace, Object[] args) {
+      StringBuilder builder = new StringBuilder(dateTimeFormatter.format(LocalDateTime.now()));
+      builder.append(" ").append(level);
+      builder.append(" ").append(targetClass.getSimpleName());
+      builder.append(" [").append(Thread.currentThread().getName()).append(']');
+      builder.append(" -");
+      for (Object arg : args) {
+        builder.append(" ").append(arg);
+      }
+      System.out.println(builder);
+      int length = args.length;
+      if (mayPrintTrace && length > 0) {
+        Object o = args[length - 1];
+        if (o instanceof Throwable) {
+          ((Throwable) o).printStackTrace();
+        }
+      }
+    }
+  }
 }
diff --git a/java/fury-core/src/main/resources/META-INF/native-image/org.apache.fury/fury-core/native-image.properties b/java/fury-core/src/main/resources/META-INF/native-image/org.apache.fury/fury-core/native-image.properties
index a8b4f60d..74f1304e 100644
--- a/java/fury-core/src/main/resources/META-INF/native-image/org.apache.fury/fury-core/native-image.properties
+++ b/java/fury-core/src/main/resources/META-INF/native-image/org.apache.fury/fury-core/native-image.properties
@@ -106,6 +106,7 @@ Args=--initialize-at-build-time=org.apache.fury.memory.MemoryBuffer,\
     org.apache.fury.shaded.org.codehaus.janino.Java$Invocation,\
     org.apache.fury.shaded.org.codehaus.janino.ReflectionI
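Editor's illustration of the technique in the commit above, which swaps the slf4j logger for a `java.lang.reflect.Proxy` whose handler prints to standard output. This is a minimal, dependency-free sketch and not the Fury implementation: `ProxyLoggerSketch`, `MiniLogger` and `formatLine` are hypothetical names, `MiniLogger` stands in for `org.slf4j.Logger`, and the `yyyy-MM-dd HH:mm:ss` pattern is this sketch's own choice. Unlike the quoted diff, which passes `false` for `mayPrintTrace` at every level, the sketch prints a trailing `Throwable` for `error`.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class ProxyLoggerSketch {
  /** Hypothetical stand-in for org.slf4j.Logger, so the sketch has no dependencies. */
  interface MiniLogger {
    boolean isInfoEnabled();
    void debug(String msg);
    void info(String msg);
    void error(String msg, Throwable cause);
  }

  /** No-op fallback, playing the role NOPLogger.NOP_LOGGER plays in the quoted diff. */
  private static final MiniLogger NOOP = new MiniLogger() {
    @Override public boolean isInfoEnabled() { return false; }
    @Override public void debug(String msg) {}
    @Override public void info(String msg) {}
    @Override public void error(String msg, Throwable cause) {}
  };

  private static final DateTimeFormatter TIME =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  /** Builds the part of the log line that does not depend on the clock. */
  static String formatLine(String level, Class<?> target, String thread, Object[] args) {
    StringBuilder builder = new StringBuilder(level);
    builder.append(' ').append(target.getSimpleName());
    builder.append(" [").append(thread).append("] -");
    for (Object arg : args) {
      builder.append(' ').append(arg);
    }
    return builder.toString();
  }

  private static void print(String level, Class<?> target, Object[] args, boolean printTrace) {
    String thread = Thread.currentThread().getName();
    System.out.println(
        TIME.format(LocalDateTime.now()) + " " + formatLine(level, target, thread, args));
    Object last = args.length > 0 ? args[args.length - 1] : null;
    if (printTrace && last instanceof Throwable) {
      ((Throwable) last).printStackTrace();
    }
  }

  /** Creates a logger whose calls are routed through an InvocationHandler. */
  static MiniLogger create(Class<?> target) {
    InvocationHandler handler = (proxy, method, args) -> {
      switch (method.getName()) {
        case "isInfoEnabled":
          return true;
        case "info":
          print("INFO", target, args, false);
          return null;
        case "error":
          print("ERROR", target, args, true);
          return null;
        default:
          // debug(...) and Object methods such as toString() fall through to the no-op.
          return method.invoke(NOOP, args);
      }
    };
    return (MiniLogger) Proxy.newProxyInstance(
        MiniLogger.class.getClassLoader(), new Class<?>[] {MiniLogger.class}, handler);
  }

  public static void main(String[] args) {
    MiniLogger log = create(ProxyLoggerSketch.class);
    log.debug("dropped by the no-op fallback");
    log.info("printed through the proxy");
  }
}
```

Dispatching on the method name keeps the handler small, at the cost of treating all overloads of a method alike; the quoted diff makes the same trade-off.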
Re: [PR] fix(java): fix slf4j on graalvm [incubator-fury]
chaokunyang merged PR #1432: URL: https://github.com/apache/incubator-fury/pull/1432
[GH] (incubator-fury): Workflow run "Fury CI" is working again!
The GitHub Actions job "Fury CI" on incubator-fury.git has succeeded. Run started by GitHub user chaokunyang (triggered by chaokunyang). Head commit for run: 0e23cbb0aa0ed1148e35ca319841190805f6e092 / chaokunyang lint code Report URL: https://github.com/apache/incubator-fury/actions/runs/8465359294 With regards, GitHub Actions via GitBox
[GH] (incubator-fury): Workflow run "Fury CI" failed!
The GitHub Actions job "Fury CI" on incubator-fury.git has failed. Run started by GitHub user chaokunyang (triggered by chaokunyang). Head commit for run: 81e3ce7caa9b4439d6b2303b9437d27ca40ec036 / chaokunyang add graalvm logger init on build time Report URL: https://github.com/apache/incubator-fury/actions/runs/8465304561 With regards, GitHub Actions via GitBox
[GH] (incubator-fury): Workflow run "Fury CI" failed!
The GitHub Actions job "Fury CI" on incubator-fury.git has failed. Run started by GitHub user chaokunyang (triggered by chaokunyang). Head commit for run: 502728503216d4c9d89e3f69f0b0f9918bb93fd8 / chaokunyang fix logger for graalvm Report URL: https://github.com/apache/incubator-fury/actions/runs/8465249383 With regards, GitHub Actions via GitBox
[GH] (incubator-fury): Workflow run "Fury CI" failed!
The GitHub Actions job "Fury CI" on incubator-fury.git has failed. Run started by GitHub user chaokunyang (triggered by chaokunyang). Head commit for run: 6b5f9d9650a1cf7978807f758bff18e8d0067cd4 / chaokunyang using print instead of slf4j for graalvm Report URL: https://github.com/apache/incubator-fury/actions/runs/8465153814 With regards, GitHub Actions via GitBox
[GH] (incubator-fury): Workflow run "Fury CI" failed!
The GitHub Actions job "Fury CI" on incubator-fury.git has failed. Run started by GitHub user chaokunyang (triggered by chaokunyang). Head commit for run: ce5aeb455431317c185460fad4a00a8bcbcc81ae / chaokunyang upgrade slf4j to 2.0.12 Report URL: https://github.com/apache/incubator-fury/actions/runs/8464208911 With regards, GitHub Actions via GitBox
[PR] chore(java): upgrade slf4j to 2.0.12 [incubator-fury]
chaokunyang opened a new pull request, #1432: URL: https://github.com/apache/incubator-fury/pull/1432 (no comment)
(incubator-fury) branch main updated: fix(java): fix bigdecimal serializer (#1431)
This is an automated email from the ASF dual-hosted git repository.

chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury.git

The following commit(s) were added to refs/heads/main by this push:
     new 552124ed fix(java): fix bigdecimal serializer (#1431)
552124ed is described below

commit 552124edd3c9cbbc512c15ba3a6974ae4cdd0b22
Author: Shawn Yang
AuthorDate: Thu Mar 28 16:02:11 2024 +0800

    fix(java): fix bigdecimal serializer (#1431)

    The bigdecimal serialization doesn't pass precison, this PR fixed it
---
 .../main/java/org/apache/fury/serializer/Serializers.java | 14 --
 .../java/org/apache/fury/serializer/SerializersTest.java  |  5 +
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/java/fury-core/src/main/java/org/apache/fury/serializer/Serializers.java b/java/fury-core/src/main/java/org/apache/fury/serializer/Serializers.java
index 19829c97..5af51591 100644
--- a/java/fury-core/src/main/java/org/apache/fury/serializer/Serializers.java
+++ b/java/fury-core/src/main/java/org/apache/fury/serializer/Serializers.java
@@ -30,6 +30,7 @@ import java.lang.reflect.InvocationTargetException;
 import java.lang.reflect.Method;
 import java.math.BigDecimal;
 import java.math.BigInteger;
+import java.math.MathContext;
 import java.net.URI;
 import java.nio.charset.Charset;
 import java.util.Currency;
@@ -338,19 +339,20 @@ public class Serializers {
     @Override
     public void write(MemoryBuffer buffer, BigDecimal value) {
       final byte[] bytes = value.unscaledValue().toByteArray();
-      Preconditions.checkArgument(bytes.length <= 16);
-      buffer.writeByte((byte) value.scale());
-      buffer.writeByte((byte) bytes.length);
+      buffer.writePositiveVarInt(value.scale());
+      buffer.writePositiveVarInt(value.precision());
+      buffer.writePositiveVarInt(bytes.length);
       buffer.writeBytes(bytes);
     }
 
     @Override
     public BigDecimal read(MemoryBuffer buffer) {
-      int scale = buffer.readByte();
-      int len = buffer.readByte();
+      int scale = buffer.readPositiveVarInt();
+      int precision = buffer.readPositiveVarInt();
+      int len = buffer.readPositiveVarInt();
       byte[] bytes = buffer.readBytes(len);
       final BigInteger bigInteger = new BigInteger(bytes);
-      return new BigDecimal(bigInteger, scale);
+      return new BigDecimal(bigInteger, scale, new MathContext(precision));
     }
   }
 
diff --git a/java/fury-core/src/test/java/org/apache/fury/serializer/SerializersTest.java b/java/fury-core/src/test/java/org/apache/fury/serializer/SerializersTest.java
index 7a0a4111..0899981a 100644
--- a/java/fury-core/src/test/java/org/apache/fury/serializer/SerializersTest.java
+++ b/java/fury-core/src/test/java/org/apache/fury/serializer/SerializersTest.java
@@ -25,6 +25,7 @@ import static org.testng.Assert.assertTrue;
 
 import java.math.BigDecimal;
 import java.math.BigInteger;
+import java.math.MathContext;
 import java.net.URI;
 import java.net.URISyntaxException;
 import java.nio.charset.Charset;
@@ -104,6 +105,10 @@ public class SerializersTest extends FuryTestBase {
     Fury fury2 = builder.build();
     assertEquals(BigInteger.valueOf(100), serDe(fury1, fury2, BigInteger.valueOf(100)));
     assertEquals(BigDecimal.valueOf(100, 2), serDe(fury1, fury2, BigDecimal.valueOf(100, 2)));
+    BigInteger bigInteger = new BigInteger("");
+    BigDecimal bigDecimal = new BigDecimal(bigInteger, 200, MathContext.DECIMAL128);
+    BigDecimal bigDecimal1 = serDe(fury1, bigDecimal);
+    assertEquals(bigDecimal1, bigDecimal);
   }
 
   @Test(dataProvider = "javaFury")
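Editor's illustration of the wire layout the commit above moves to (scale, precision, byte length, then the unscaled value's two's-complement bytes): a minimal round-trip sketch using `java.nio.ByteBuffer`. It is not the Fury serializer. `BigDecimalCodecSketch` is a hypothetical name, and fixed 4-byte signed ints stand in for Fury's `writePositiveVarInt`, whose encoding is not shown in the diff. A signed field is used for the scale because `BigDecimal.scale()` can be negative (for example `1E+3` has scale -3); how the quoted code treats that case cannot be determined from the diff alone.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.nio.ByteBuffer;

final class BigDecimalCodecSketch {
  /** Layout assumed by this sketch: int scale, int precision, int length, unscaled bytes. */
  static byte[] write(BigDecimal value) {
    byte[] unscaled = value.unscaledValue().toByteArray();
    ByteBuffer buffer = ByteBuffer.allocate(3 * Integer.BYTES + unscaled.length);
    buffer.putInt(value.scale()); // may be negative
    buffer.putInt(value.precision()); // always >= 1
    buffer.putInt(unscaled.length); // no 16-byte cap, unlike the removed precondition
    buffer.put(unscaled);
    return buffer.array();
  }

  static BigDecimal read(byte[] data) {
    ByteBuffer buffer = ByteBuffer.wrap(data);
    int scale = buffer.getInt();
    int precision = buffer.getInt();
    byte[] unscaled = new byte[buffer.getInt()];
    buffer.get(unscaled);
    // Mirrors the three-argument constructor used in the commit.
    return new BigDecimal(new BigInteger(unscaled), scale, new MathContext(precision));
  }

  public static void main(String[] args) {
    BigDecimal original = new BigDecimal("123.4500");
    BigDecimal copy = read(write(original));
    System.out.println(original.equals(copy));
  }
}
```

Because the precision written is the value's own precision, the `MathContext` on the read side performs no rounding here, and the value comes back with the same unscaled value and scale.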
Re: [PR] fix(java): fix bigdecimal serializer [incubator-fury]
chaokunyang merged PR #1431: URL: https://github.com/apache/incubator-fury/pull/1431
Re: [PR] fix(license): add DISCLAIMER and NOTICE for built jars [incubator-fury]
chaokunyang commented on PR #1430:
URL: https://github.com/apache/incubator-fury/pull/1430#issuecomment-2024611135

> Lgtm. When we get to the Incubator vote, other PMC members might ask for more changes but this should be alright.

OK, I will send a new release vote later
(incubator-fury) branch main updated: fix(license): add DISCLAIMER and NOTICE for built jars (#1430)
This is an automated email from the ASF dual-hosted git repository. chaokunyang pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/incubator-fury.git The following commit(s) were added to refs/heads/main by this push: new 2775ab26 fix(license): add DISCLAIMER and NOTICE for built jars (#1430) 2775ab26 is described below commit 2775ab267162c2ce31933536b01cd04737b12539 Author: Shawn Yang AuthorDate: Thu Mar 28 15:54:44 2024 +0800 fix(license): add DISCLAIMER and NOTICE for built jars (#1430) This PR adds DISCLAIMER and NOTICE for built jars to comply with ASF release policy --- .../src/main/resources/META-INF/DISCLAIMER | 10 java/fury-core/src/main/resources/META-INF/NOTICE | 68 ++ .../resources/META-INF/licenses/LICENSE-janino.txt | 31 ++ .../resources/META-INF/licenses/LICENSE-kryo.txt | 10 .../src/main/resources/META-INF/DISCLAIMER | 10 .../fury-format/src/main/resources/META-INF/NOTICE | 23 6 files changed, 152 insertions(+) diff --git a/java/fury-core/src/main/resources/META-INF/DISCLAIMER b/java/fury-core/src/main/resources/META-INF/DISCLAIMER new file mode 100644 index ..03c04ea9 --- /dev/null +++ b/java/fury-core/src/main/resources/META-INF/DISCLAIMER @@ -0,0 +1,10 @@ +Apache Fury (incubating) is an effort undergoing incubation at the Apache +Software Foundation (ASF), sponsored by the Apache Incubator PMC. + +Incubation is required of all newly accepted projects until a further review +indicates that the infrastructure, communications, and decision making process +have stabilized in a manner consistent with other successful ASF projects. + +While incubation status is not necessarily a reflection of the completeness +or stability of the code, it does indicate that the project has yet to be +fully endorsed by the ASF. 
diff --git a/java/fury-core/src/main/resources/META-INF/NOTICE b/java/fury-core/src/main/resources/META-INF/NOTICE new file mode 100644 index ..f04b4fdf --- /dev/null +++ b/java/fury-core/src/main/resources/META-INF/NOTICE @@ -0,0 +1,68 @@ +Apache Fury (Incubating) +Copyright 2023-2024 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + + + +This product includes a number of Dependencies with separate copyright notices +and license terms. Your use of these submodules is subject to the terms and +conditions of the following licenses. + + + + +Apache-2.0 licenses + +The following components are provided under the Apache-2.0 License. See project link for details. +The text of each license is the standard Apache 2.0 license. + +* guava (https://github.com/google/guava) +Files: + java/fury-core/src/main/java/org/apache/fury/util/Preconditions.java + +* spark (https://github.com/apache/spark) +Files: + java/fury-core/src/main/java/org/apache/fury/codegen/Code.java + java/fury-core/src/main/java/org/apache/fury/util/Platform.java + +* flink (https://github.com/apache/flink) +Files: + java/fury-core/src/main/java/org/apache/fury/memory/MemoryBuffer.java + +* commons-io (https://github.com/apache/commons-io) +Files: + java/fury-core/src/main/java/org/apache/fury/io/ClassLoaderObjectInputStream.java + + + +BSD-3-Clause licenses + +The following components are provided under the BSD-3-Clause License. See project link for details. +The text of each license is also included in licenses/LICENSE-[project].txt. 
+ +* kryo (https://github.com/EsotericSoftware/kryo) +Files: + java/fury-core/src/main/java/org/apache/fury/collection/FuryObjectMap.java + java/fury-core/src/main/java/org/apache/fury/collection/IdentityMap.java + java/fury-core/src/main/java/org/apache/fury/collection/IdentityObjectIntMap.java + java/fury-core/src/main/java/org/apache/fury/collection/LongMap.java + java/fury-core/src/main/java/org/apache/fury/collection/ObjectIntMap.java + java/fury-core/src/main/java/org/apache/fury/type/Generics.java + +* janino (https://github.com/janino-compiler/janino) +Files: + Shaded classes under org/apache/fury/shaded/org/codehaus/janino/* + + + +Public Domain + +The following components are placed in the public domain. +The author hereby disclaims copyright to this source code. +See project link for details. + +* java_util (https://github.com/yonik/java_util) +Files: +
Re: [PR] fix(license): add DISCLAIMER and NOTICE for built jars [incubator-fury]
chaokunyang merged PR #1430: URL: https://github.com/apache/incubator-fury/pull/1430 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org For additional commands, e-mail: commits-h...@fury.apache.org
[PR] fix(java): fix bigdecimal serializer [incubator-fury]
chaokunyang opened a new pull request, #1431: URL: https://github.com/apache/incubator-fury/pull/1431 BigDecimal serialization did not pass the precision; this PR fixes it.
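For readers unfamiliar with the issue, here is a minimal sketch of a BigDecimal round trip that carries the precision alongside the scale and the unscaled value. It is not Fury's serializer: the class name `BigDecimalCodecSketch`, the `encode`/`decode` helpers, and the byte layout (scale, precision, length, unscaled bytes, all written with `DataOutputStream`) are assumptions made for illustration, since the PR description does not spell out the wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

public class BigDecimalCodecSketch {
  // Hypothetical layout: scale (int), precision (int), length (int), unscaled bytes.
  public static byte[] encode(BigDecimal value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      byte[] unscaled = value.unscaledValue().toByteArray();
      out.writeInt(value.scale());
      out.writeInt(value.precision());
      out.writeInt(unscaled.length);
      out.write(unscaled);
    }
    return bytes.toByteArray();
  }

  // Reads the fields back in the same order and passes the precision
  // to the BigDecimal constructor through a MathContext.
  public static BigDecimal decode(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int scale = in.readInt();
      int precision = in.readInt();
      byte[] unscaled = new byte[in.readInt()];
      in.readFully(unscaled);
      return new BigDecimal(new BigInteger(unscaled), scale, new MathContext(precision));
    }
  }

  public static void main(String[] args) throws IOException {
    BigDecimal original = new BigDecimal("12345.67800");
    BigDecimal copy = decode(encode(original));
    System.out.println(copy + " scale=" + copy.scale() + " precision=" + copy.precision());
  }
}
```

The point of the sketch is only that writer and reader must agree on every field: if one side handles the precision and the other does not, the two sides fall out of step on the byte stream.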
Re: [PR] fix: use blazingly instead of blazing [incubator-fury]
chaokunyang commented on PR #1428: URL: https://github.com/apache/incubator-fury/pull/1428#issuecomment-2024578699 Great, thanks
Re: [PR] fix: use blazingly instead of blazing [incubator-fury]
pjfanning commented on PR #1428: URL: https://github.com/apache/incubator-fury/pull/1428#issuecomment-2024571055 The .asf.yaml file is used to update GitHub settings like the project description. I think the value is updated now because I included a change in .asf.yaml in this PR.