[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841541780 I'm closing this, as we agree that we don't want to remove the TextEncoder API. Additional ideas for speedup: Encode: Use TextEncoder..encodeInto() and pre-allocate

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841533090 What do you think about shipping different entry files for node and browser and solve this issue compile time? -- This is an automated message from the Apache Git Service. To

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841529879 Performance: I ran microbenchmarks for TextEncoder vs Buffer+utils and Buffer based is 50% percent faster for small, ascii strings. As we are using larger strings or we switch t

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841526434 @domoritz what's happening? Am I reading the code wrong and it's actually hitting https://github.com/apache/arrow/blob/master/js/src/util/utf8.ts#L35-L38 somehow, or is it just

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841520624 During bencharking it I realized that we already use TextEncoder and TextDecoder only. `useNativeEncoders` is always evaluated to `true` in all supported Node versions and it

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841514959 Dictionary may work as well, indeed. My understanding was that the buffer -> string conversions happen only once for every dictionary value and it's cloned later. -- This is

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841512097 @domoritz I agree that some benchmarks should be added, I'll took into that this weekend. -- This is an automated message from the Apache Git Service. To respond to the messag

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841511834 https://github.com/apache/arrow/blob/master/js/test/data/tables/generate.py isn't this the test data generator script? I see only float and categorical here, no string.

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-04-29 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-829692597 @trxcllnt the full suite ran in 359s vs 360s. Do you have specific testcase I should check or are you referring to microbenchmarks? I can do both -- This is an automated messa

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-04-28 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-828233314 Sure, I'll do that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-04-28 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-828219343 @kou I saw you removed the release tag 4.0.1. I don't know whether this is a bug or an improvement, however it fixes a regression which was introduced in 4.0.0 so including it i

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-04-28 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-828217551 https://github.com/apache/arrow/blob/3a6f6053c74eb698208395091009ac50be9dc29e/js/package.json#L109 -- This is an automated message from the Apache Git Service. To respond to t

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-04-27 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-828189114 I saw in package.json it's already required -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov