Re: [I] Remove commented content in git commit message [fury]

2024-07-03 Thread via GitHub


urlyy commented on issue #1588:
URL: https://github.com/apache/fury/issues/1588#issuecomment-2208087225

   That's strange. Do you copy the PR body into the merge description when 
confirming a merge? If so, I think we can modify the PR body automatically 
after people create the PR. I found a GitHub Actions plugin: [Update PR 
Description](https://github.com/marketplace/actions/update-pr-description) that 
works as follows.
   
![image](https://github.com/apache/fury/assets/61675635/23d50839-ac6b-40ce-b1c0-ca4372178725)
   It works in my demo:
   
![image](https://github.com/apache/fury/assets/61675635/78ad863e-7263-435b-8991-b883530858cc)
   However, for this issue, it might be necessary to create a custom action 
plugin. I'd like to take this on, but I'm not very familiar with writing action 
plugins. Could you allow me a few days to study this and then we can continue 
the discussion?
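   
   As a rough illustration of what such a custom action might do, here is a 
minimal sketch that strips HTML comments from a PR body before it is reused as 
the merge description. It assumes the commented content in question is the HTML 
comments left over from the PR template; `stripHtmlComments` and the sample body 
are hypothetical and are not part of Fury or of the Update PR Description action.

```JavaScript
// Hypothetical helper: removes HTML comments from a PR body.
function stripHtmlComments(body) {
  return body
    .replace(/<!--[\s\S]*?-->/g, "") // drop comments, including multi-line ones
    .replace(/\n{3,}/g, "\n\n")      // collapse the blank runs left behind
    .trim();
}

// Sample PR body (made up for this sketch).
const body = [
  "## What does this PR do?",
  "<!-- Describe the purpose of this PR. -->",
  "Use TextDecoder to decode buffers.",
].join("\n");

const cleaned = stripHtmlComments(body);
console.log(cleaned);
```

   A real action would still need to read the PR body from the event payload 
and write the cleaned text back through the GitHub API; that part is left out 
here.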



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@fury.apache.org
For additional commands, e-mail: commits-h...@fury.apache.org



Re: [D] the pyfury demo code shown in the website doesn't work with pyfury :( [fury]

2024-07-03 Thread via GitHub


GitHub user justincui edited a comment on the discussion: the pyfury demo code 
shown in the website doesn't work with pyfury :(

with the following setting:
```
python --version 
Python 3.9.16

pip list
Package Version
--- ---
cloudpickle 3.0.0
numpy   1.26.4
pip 24.1.1
pyarrow 14.0.2
pyfury  0.4.1
setuptools  53.0.0
```
The following code can be run:
```
Python 3.9.16 (main, Dec  8 2022, 00:00:00) 
[GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyfury
>>> import numpy as np
>>> fury = pyfury.Fury()
>>> object_list = [True, False, "str", -1.1, 1,
...     np.full(100, 0, dtype=np.int32), np.full(20, 0.0, dtype=np.double)]
>>> data = fury.serialize(object_list)
>>> new_list = fury.deserialize(data)
>>> object_map = {"k1": "v1", "k2": object_list, "k3": -1}
>>> data = fury.serialize(object_map)
>>> # bytes can be data serialized by other languages.
>>> new_map = fury.deserialize(data)
>>> print(new_map)
{'k1': 'v1', 'k2': [True, False, 'str', -1.1, 1, array([0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32), array([0., 0., 0., 
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
   0., 0., 0.])], 'k3': -1}
```
but the following code still dumps core:
```
Python 3.9.16 (main, Dec  8 2022, 00:00:00) 
[GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import Dict
>>> import pyfury
>>> 
>>> class SomeClass:
...     f1: "SomeClass"
...     f2: Dict[str, str]
...     f3: Dict[str, str]
... 
>>> fury = pyfury.Fury(ref_tracking=True)
>>> fury.register_class(SomeClass)
>>> obj = SomeClass()
>>> obj.f2 = {"k1": "v1", "k2": "v2"}
>>> obj.f1, obj.f3 = obj, obj.f2
>>> data = fury.serialize(obj)
Segmentation fault (core dumped)
```

GitHub link: 
https://github.com/apache/fury/discussions/1719#discussioncomment-9953030


This is an automatically sent email for commits@fury.apache.org.
To unsubscribe, please send an email to: commits-unsubscr...@fury.apache.org





Re: [D] the pyfury demo code shown in the website doesn't work with pyfury :( [fury]

2024-07-03 Thread via GitHub


GitHub user justincui edited a comment on the discussion: the pyfury demo code 
shown in the website doesn't work with pyfury :(

with the following setting:
```
python --version 
Python 3.9.16

pip list
Package Version
--- ---
cloudpickle 3.0.0
numpy   1.26.4
pip 24.1.1
pyarrow 14.0.2
pyfury  0.4.1
setuptools  53.0.0
```
The following code can be run:
```
import pyfury
import numpy as np

fury = pyfury.Fury()
object_list = [True, False, "str", -1.1, 1,
               np.full(100, 0, dtype=np.int32), np.full(20, 0.0, dtype=np.double)]
data = fury.serialize(object_list)
# bytes can be data serialized by other languages.
new_list = fury.deserialize(data)
object_map = {"k1": "v1", "k2": object_list, "k3": -1}
data = fury.serialize(object_map)
# bytes can be data serialized by other languages.
new_map = fury.deserialize(data)
print(new_map)
```
but the following code still dumps core:
```
Python 3.9.16 (main, Dec  8 2022, 00:00:00) 
[GCC 11.3.1 20221121 (Red Hat 11.3.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import Dict
>>> import pyfury
>>> 
>>> class SomeClass:
...     f1: "SomeClass"
...     f2: Dict[str, str]
...     f3: Dict[str, str]
... 
>>> fury = pyfury.Fury(ref_tracking=True)
>>> fury.register_class(SomeClass)
>>> obj = SomeClass()
>>> obj.f2 = {"k1": "v1", "k2": "v2"}
>>> obj.f1, obj.f3 = obj, obj.f2
>>> data = fury.serialize(obj)
Segmentation fault (core dumped)
```

GitHub link: 
https://github.com/apache/fury/discussions/1719#discussioncomment-9953030





Re: [D] the pyfury demo code shown in the website doesn't work with pyfury :( [fury]

2024-07-03 Thread via GitHub


GitHub user justincui added a comment to the discussion: the pyfury demo code 
shown in the website doesn't work with pyfury :(

Finally, I found the correct setup:
`pip list`
```
Package Version
--- ---
cloudpickle 3.0.0
numpy   1.26.4
pip 24.1.1
pyarrow 14.0.2
pyfury  0.4.1
setuptools  53.0.0

```

GitHub link: 
https://github.com/apache/fury/discussions/1719#discussioncomment-9953030





Re: [I] [Golang] Implement xlang string for furygo [fury]

2024-07-03 Thread via GitHub


urlyy commented on issue #1705:
URL: https://github.com/apache/fury/issues/1705#issuecomment-2205887968

   Hi, I'm interested in this issue and am now learning the xlang string 
format, shown below. However, I can't find a `string.go` file. Do I need to 
create a new file named `string.go` and implement `func (s stringSerializer) 
Read(...)` and `func (s stringSerializer) Write(...)`?
   
![image](https://github.com/apache/fury/assets/61675635/26950663-0283-47d0-bc97-c5a075242d79)
   





Re: [PR] feat(javascript): Meta string encoding algorithm for JavaScript [fury]

2024-07-03 Thread via GitHub


theweipeng commented on PR #1675:
URL: https://github.com/apache/fury/pull/1675#issuecomment-2205736393

   > Hi, @theweipeng , Please have a look at my PR. I have been trying to 
adjust the object.test.ts files so as to pass the pipeline tests, but I really 
don't know how the `writeStmt` and `readStmt` functions are tested in the 
objects.test.ts files. Please help me with a few hints.
   
   First, let me explain why we need to generate code, which is necessary for a 
general understanding.
   
   Without generating code for each object type, the JavaScript would look like 
this:
   ```JavaScript
function writeObject(input, writer, description) {
    description.fields.forEach(field => {
        const binary = metaString.encode(field.name);
        writer.write(binary);
    });
    description.fields.forEach(field => {
        writeXxxx(input[field.name], writer, field.description);
    });
}
```
   In this pseudo-code, writeObject calls metaString.encode for every field on 
every invocation, and it also iterates over description.fields each time, which 
is unnecessary. Performance is very important for Fury.
   
   If we generate code for each object type, the JavaScript will look like this:
   ```JavaScript
const metaBinary = description.fields
    .map(field => metaString.encode(field.name))
    .join(/* code to join the binary data */);
function writeObject(input, writer) {
    writer.write(metaBinary);
    writeString(input.b);
    writeBoolean(input.c);
}
```
   In this pseudo-code, metaBinary is computed only once, when writeObject is 
created, so we avoid all overhead except the necessary buffer writes.
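   
   To make the difference concrete, the following is a minimal sketch that 
counts how often the name encoding runs in each style. `encode`, 
`writeGeneric`, `createWriter`, and the array used as a writer are toy 
stand-ins invented for this illustration, not Fury's API.

```JavaScript
// Toy stand-in for meta string encoding; counts how often it runs.
let encodeCalls = 0;
const encode = (name) => { encodeCalls += 1; return `<${name}>`; };

const description = { fields: [{ name: "b" }, { name: "c" }] };

// Generic writer: re-encodes every field name on every call.
function writeGeneric(input, out, description) {
  for (const field of description.fields) out.push(encode(field.name));
  for (const field of description.fields) out.push(String(input[field.name]));
}

// Specialized writer: the meta is encoded once, when the writer is created.
function createWriter(description) {
  const metaBinary = description.fields.map((f) => encode(f.name)).join("");
  return function writeObject(input, out) {
    out.push(metaBinary);
    out.push(String(input.b));
    out.push(String(input.c));
  };
}

const obj = { b: "hi", c: true };

encodeCalls = 0;
for (let i = 0; i < 3; i++) writeGeneric(obj, [], description);
const genericCalls = encodeCalls; // two fields encoded on each of three writes

encodeCalls = 0;
const writeObject = createWriter(description);
for (let i = 0; i < 3; i++) writeObject(obj, []);
const specializedCalls = encodeCalls; // two fields encoded once, at creation

console.log({ genericCalls, specializedCalls });
```

   The specialized writer here is written by hand; in the generated-code 
approach the same body would be produced from the description.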
   
   writeStmt is used to generate a code fragment that will be embedded by the 
function toSerializer, declared in BaseSerializerGenerator.
   
   For example, a description like the one below will generate code when 
registerSerializer is called:
   ```JavaScript
   const description = Type.object('example.foo', {
       b: Type.string(),
       c: Type.bool()
   });

   const fury = new Fury({ refTracking: true });
   const serializer = fury.registerSerializer(description).serializer;
   ```
   The call chain is generateSerializer(gen/index.ts) -> 
generate(gen/index.ts) -> generator.toSerializer(gen/index.ts) -> 
toSerializer(gen/serializer.ts) -> writeStmt(gen/object.ts). `funcString`, 
declared in generate(gen/index.ts), stores the final code. 
   
   writeStmt iterates through the properties defined in the description (in the 
example, they are b and c) and generates the code based on the property's 
description.
   
   In this example, prop b will call writeStmt declared in gen/string.ts.
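   
   The composition described above can be sketched roughly as follows. This is 
a simplified illustration, not the code in gen/: the `generators` table, 
`objectWriteStmt`, the tagged output format, and the use of `new Function` to 
turn the source text into a function are all assumptions made for the example.

```JavaScript
// Hypothetical per-type generators: each returns a source fragment, not a value.
const generators = {
  string: { writeStmt: (accessor) => `out.push("s:" + ${accessor});` },
  bool: { writeStmt: (accessor) => `out.push("b:" + (${accessor} ? 1 : 0));` },
};

// Object-level writeStmt: asks each property's generator for its fragment
// and concatenates the fragments in declaration order.
function objectWriteStmt(props, accessor) {
  return Object.entries(props)
    .map(([name, type]) => generators[type].writeStmt(`${accessor}.${name}`))
    .join("\n");
}

// Embeds the fragments in a function body; funcString holds the final source.
function toSerializer(props) {
  const funcString =
    `return function write(input, out) {\n${objectWriteStmt(props, "input")}\n};`;
  return { funcString, write: new Function(funcString)() };
}

const { funcString, write } = toSerializer({ b: "string", c: "bool" });
const out = [];
write({ b: "hi", c: true }, out);

console.log(funcString); // the generated source text
console.log(out);        // out is ["s:hi", "b:1"]
```

   Printing the generated source, as done here with `funcString`, is often the 
quickest way to see what a given description expands to.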
   
   You can add a script for debugging like this:
   
   
https://github.com/apache/fury/assets/16490211/2c1f7b34-a4bd-4774-bc26-556b80a89c02
   
   And click Debug:
   
   
https://github.com/apache/fury/assets/16490211/77d67ae8-042a-4ae1-b878-9de39777fb49
   
   The test will pause at the breakpoint, allowing you to trace the call chain 
by debugging the code generator.
   
   In your case, you should create a write instance for meta inside a closure, 
thus allowing direct writing of the meta binary.
   
   For instance:
   ```JavaScript
   // This line declares a variable inside a closure.
   const metaBinary = this.scope.declare("metaBinary", `${metaString.encode(JSON.stringify(description))}`);
   return `
   ${this.builder.writer.binary(metaBinary)};
   `;
   ```
   tagWriter is in a similar situation; this should be helpful to you:
   
   
https://github.com/apache/fury/assets/16490211/401e9f36-09c6-4204-9102-b115fac9631a





[GH] (fury): Workflow run "Fury CI" failed!

2024-07-03 Thread GitBox


The GitHub Actions job "Fury CI" on fury.git has failed.
Run started by GitHub user zhaoomo (triggered by zhaoomo).

Head commit for run:
39b481a323c574d3f0539f4416913ffd2613cc35 / zhaoomo 
Merge remote-tracking branch 'origin/main' into 
feat/java_fast_object_copy_framework

Report URL: https://github.com/apache/fury/actions/runs/9775739257

With regards,
GitHub Actions via GitBox





[GH] (fury): Workflow run "Fury CI" is working again!

2024-07-03 Thread GitBox


The GitHub Actions job "Fury CI" on fury.git has succeeded.
Run started by GitHub user theweipeng (triggered by theweipeng).

Head commit for run:
d25ccbb1803f137cc5e9bf15fa060ae67c5be7ad / 野声 
feat: use TextDecoder to decode buffer (#1699)



## What does this PR do?



Use the browser builtin decode utils: TextDecoder

https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder

This improves performance considerably; see the benchmark diff below.

The original implementation also triggers many `minor gc` runs in
JavaScript; the longer the string, the longer the GC time.
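
For readers unfamiliar with the API, here is a minimal sketch of decoding a byte
range with a reused `TextDecoder`. The sample string and variable names are
illustrative only; the general shape (one decoder instance, `subarray` to select
the range without copying) mirrors what the PR describes.

```JavaScript
// Encode a sample string containing 1-, 2-, 3-, and 4-byte UTF-8 sequences.
const bytes = new TextEncoder().encode("héllo, 世界 😀");

// Create the decoder once and reuse it for every call.
const decoder = new TextDecoder("utf-8");

// subarray() returns a view on the same memory, so no bytes are copied.
const whole = decoder.decode(bytes.subarray(0, bytes.length));

// "héllo" occupies 6 bytes, because "é" takes two.
const prefix = decoder.decode(bytes.subarray(0, 6));

console.log(whole);
console.log(prefix);
```

Because the decoder builds the string natively, it avoids the repeated string
concatenation of a hand-written byte loop.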

## Related issues



## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?


## Benchmark



![CleanShot 2024-06-25 at 17 04 45@2x](https://github.com/apache/fury/assets/13938334/29517aa9-563f-46ba-9aec-563d49c8db9a)

Report URL: https://github.com/apache/fury/actions/runs/9775275451

With regards,
GitHub Actions via GitBox





Re: [PR] feat: use TextDecoder to decode buffer [fury]

2024-07-03 Thread via GitHub


theweipeng merged PR #1699:
URL: https://github.com/apache/fury/pull/1699





(fury) branch main updated: feat: use TextDecoder to decode buffer (#1699)

2024-07-03 Thread wangweipeng
This is an automated email from the ASF dual-hosted git repository.

wangweipeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fury.git


The following commit(s) were added to refs/heads/main by this push:
 new d25ccbb1 feat: use TextDecoder to decode buffer (#1699)
d25ccbb1 is described below

commit d25ccbb1803f137cc5e9bf15fa060ae67c5be7ad
Author: 野声 
AuthorDate: Wed Jul 3 17:23:28 2024 +0800

feat: use TextDecoder to decode buffer (#1699)



## What does this PR do?



Use the browser builtin decode utils: TextDecoder

https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder

This improves performance considerably; see the benchmark diff below.

The original implementation also triggers many `minor gc` runs in
JavaScript; the longer the string, the longer the GC time.

## Related issues



## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?


## Benchmark



![CleanShot 2024-06-25 at 17 04 45@2x](https://github.com/apache/fury/assets/13938334/29517aa9-563f-46ba-9aec-563d49c8db9a)
---
 javascript/packages/fury/lib/platformBuffer.ts |  25 ++++++++-----------------
 platform-buffer.jpg                            | Bin 0 -> 37116 bytes
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/javascript/packages/fury/lib/platformBuffer.ts b/javascript/packages/fury/lib/platformBuffer.ts
index b28df55b..e3a0978d 100644
--- a/javascript/packages/fury/lib/platformBuffer.ts
+++ b/javascript/packages/fury/lib/platformBuffer.ts
@@ -19,6 +19,9 @@
 
 import { hasBuffer } from "./util";
 
+let utf8Encoder: TextEncoder | null;
+let textDecoder: TextDecoder | null;
+
 export type SupportedEncodings = "latin1" | "utf8";
 
 export interface PlatformBuffer extends Uint8Array {
@@ -96,22 +99,12 @@ export class BrowserBuffer extends Uint8Array implements PlatformBuffer {
     if (end - start < 1) {
       return "";
     }
-    let str = "";
-    for (let i = start; i < end;) {
-      const t = this[i++];
-      if (t <= 0x7F) {
-        str += String.fromCharCode(t);
-      } else if (t >= 0xC0 && t < 0xE0) {
-        str += String.fromCharCode((t & 0x1F) << 6 | this[i++] & 0x3F);
-      } else if (t >= 0xE0 && t < 0xF0) {
-        str += String.fromCharCode((t & 0xF) << 12 | (this[i++] & 0x3F) << 6 | this[i++] & 0x3F);
-      } else if (t >= 0xF0) {
-        const t2 = ((t & 7) << 18 | (this[i++] & 0x3F) << 12 | (this[i++] & 0x3F) << 6 | this[i++] & 0x3F) - 0x1;
-        str += String.fromCharCode(0xD800 + (t2 >> 10));
-        str += String.fromCharCode(0xDC00 + (t2 & 0x3FF));
-      }
+
+    if (!textDecoder) {
+      textDecoder = new TextDecoder("utf-8");
     }
-    return str;
+
+    return textDecoder.decode(this.subarray(start, end));
   }
 
   copy(target: Uint8Array, targetStart?: number, sourceStart?: number, sourceEnd?: number) {
@@ -153,8 +146,6 @@ export const alloc = (hasBuffer ? Buffer.allocUnsafe : BrowserBuffer.alloc) as u
 
 export const strByteLength = hasBuffer ? Buffer.byteLength : BrowserBuffer.byteLength;
 
-let utf8Encoder: TextEncoder | null;
-
 export const fromString
   = hasBuffer
     ? (str: string) => Buffer.from(str) as unknown as PlatformBuffer
diff --git a/platform-buffer.jpg b/platform-buffer.jpg
new file mode 100644
index ..0b69b424
Binary files /dev/null and b/platform-buffer.jpg differ





[GH] (fury): Workflow run "Fury CI" failed!

2024-07-03 Thread GitBox


The GitHub Actions job "Fury CI" on fury.git has failed.
Run started by GitHub user pandalee99 (triggered by pandalee99).

Head commit for run:
029538341a8ebba58e73bfc0cdc47a20f5f98824 / pandalee99 <1162953...@qq.com>
fix

Report URL: https://github.com/apache/fury/actions/runs/9774150261

With regards,
GitHub Actions via GitBox





[PR] feat(C++): String detection is performed using SIMD techniques [fury]

2024-07-03 Thread via GitHub


pandalee99 opened a new pull request, #1720:
URL: https://github.com/apache/fury/pull/1720

   ## What does this PR do?
   ref: https://arxiv.org/pdf/1902.08318.pdf
   ref: https://github.com/simdutf/simdutf
   I studied the relevant SIMD techniques, along with the paper and project 
referenced above.
   This PR uses SIMD for string detection.
   First, I implemented the baseline logic for Latin character detection:
   ``` c++
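   #include <string>

   // Baseline implementation: checks one byte at a time.
   bool isLatin_Baseline(const std::string& str) {
     for (char c : str) {
       // Compare as unsigned so that bytes >= 128 are not seen as negative.
       if (static_cast<unsigned char>(c) >= 128) {
         return false;
       }
     }
     return true;
   }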
   ```
   https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_base_line_test1.png
   Then I used SSE2 to speed it up. The approach is to load multiple characters 
at once and test them with bitwise operations.
   This gave a clear speedup, but I didn't think it was enough, so I tried 
again with AVX2:
   https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_test_all_1.png
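   
   The "read several characters at once, then use bit arithmetic" idea can be 
illustrated without intrinsics. The sketch below is not the SSE2 or AVX2 code 
from this PR; it is a SWAR (SIMD within a register) analog that tests four 
bytes per step through a 32-bit view, and the function names are made up for 
the example. It uses the same criterion as the baseline (any byte >= 128 fails).

```JavaScript
// Scalar baseline: one byte per iteration.
function isAsciiScalar(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] >= 0x80) return false;
  }
  return true;
}

// SWAR variant: tests the high bit of four bytes with a single mask.
function isAsciiSwar(bytes) {
  // A Uint32Array view needs a 4-byte-aligned offset; copy if it is not.
  const src = bytes.byteOffset % 4 === 0 ? bytes : new Uint8Array(bytes);
  const words = src.length >>> 2;
  const view = new Uint32Array(src.buffer, src.byteOffset, words);
  for (let i = 0; i < words; i++) {
    // 0x80808080 selects the top bit of each byte; byte order does not matter.
    if ((view[i] & 0x80808080) !== 0) return false;
  }
  // Handle the remaining 0 to 3 tail bytes one at a time.
  for (let i = words << 2; i < src.length; i++) {
    if (src[i] >= 0x80) return false;
  }
  return true;
}

const encoder = new TextEncoder();
const samples = ["hello world!!", "héllo world", "abcdé", ""];
for (const s of samples) {
  const data = encoder.encode(s);
  console.log(JSON.stringify(s), isAsciiScalar(data), isAsciiSwar(data));
}
```

   SSE2 and AVX2 apply the same test to 16 or 32 bytes per instruction, which 
is where the larger speedups come from.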
   In terms of efficiency, this is already much faster than the baseline. 
   But how can we show that it is also logically correct?
   I added test cases to verify this:
   
   ``` C++
   TEST(StringUtilTest, TestIsLatinLogic)
   ```
   
   Finally, I ran the test
   https://raw.githubusercontent.com/pandalee99/image_store/master/hexo/simd_ubantu_test_1.png
   done.
   
   
   
   
   
   ## Related issues
   Closes #313 
   
   
   
   
   ## Does this PR introduce any user-facing change?
   
   
   
   - [ ] Does this PR introduce any public API change?
   - [ ] Does this PR introduce any binary protocol compatibility change?
   
   
   ## Benchmark
   
   
   

