[GitHub] [arrow] domoritz commented on pull request #10332: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10332: URL: https://github.com/apache/arrow/pull/10332#issuecomment-841602502 @kou can we add this to the 4.0.1 release since it fixes build issues (https://issues.apache.org/jira/browse/ARROW-12734)? -- This is an automated message from the Apache

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841601667 I think there was some misunderstanding here. Despite the performance degradation, we can still merge this change as it simplifies the setup and fixes Rollup builds. I re-opened

[GitHub] [arrow] github-actions[bot] commented on pull request #10332: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10332: URL: https://github.com/apache/arrow/pull/10332#issuecomment-841601571 https://issues.apache.org/jira/browse/ARROW-12578 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] domoritz opened a new pull request #10332: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz opened a new pull request #10332: URL: https://github.com/apache/arrow/pull/10332 By @alippai Reopened https://github.com/apache/arrow/pull/10181 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow-datafusion] houqp commented on a change in pull request #55: Support qualified columns in queries

2021-05-14 Thread GitBox
houqp commented on a change in pull request #55: URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r632895922 ## File path: datafusion/src/logical_plan/plan.rs ## @@ -141,7 +137,7 @@ pub enum LogicalPlan { /// Produces rows from a table provider by

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841599204 String decoding is really really slow. Would you want to help with the materialization (https://issues.apache.org/jira/browse/ARROW-10220) @alippai? -- This is an automated

[GitHub] [arrow] github-actions[bot] commented on pull request #10331: ARROW-12796: [JS] Support JSON output from benchmarks

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10331: URL: https://github.com/apache/arrow/pull/10331#issuecomment-841597553 https://issues.apache.org/jira/browse/ARROW-12796 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] domoritz opened a new pull request #10331: ARROW-12796: [JS] Support JSON output from benchmarks

2021-05-14 Thread GitBox
domoritz opened a new pull request #10331: URL: https://github.com/apache/arrow/pull/10331 Supports getting the results as JSON. Here is an example snippet. ```json [ { "name": "name: 'lat', length: 1,000,000, type: Float32, test: gt, value: 0", "ops":

[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #328: fix 305 by using a null array as param for zero param functions

2021-05-14 Thread GitBox
Jimexist commented on a change in pull request #328: URL: https://github.com/apache/arrow-datafusion/pull/328#discussion_r632883017 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1358,6 +1366,18 @@ impl fmt::Display for ScalarFunctionExpr { } } +///

[GitHub] [arrow-datafusion] Jimexist commented on pull request #307: fix 305 by using a scalar uint as param for zero param functions

2021-05-14 Thread GitBox
Jimexist commented on pull request #307: URL: https://github.com/apache/arrow-datafusion/pull/307#issuecomment-841586169 closing this as we are aligned on using `NullArray`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow-datafusion] Jimexist closed pull request #307: fix 305 by using a scalar uint as param for zero param functions

2021-05-14 Thread GitBox
Jimexist closed pull request #307: URL: https://github.com/apache/arrow-datafusion/pull/307 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow] github-actions[bot] commented on pull request #10330: ARROW-11673: [C++] Casting dictionary type to use different index type [WIP]

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10330: URL: https://github.com/apache/arrow/pull/10330#issuecomment-841580077 https://issues.apache.org/jira/browse/ARROW-11673 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] edponce opened a new pull request #10330: ARROW-11673: [C++] Casting dictionary type to use different index type [WIP]

2021-05-14 Thread GitBox
edponce opened a new pull request #10330: URL: https://github.com/apache/arrow/pull/10330 This PR provides functionality to cast Dictionary indices between integral types. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [arrow] nealrichardson closed pull request #10325: Update README.md

2021-05-14 Thread GitBox
nealrichardson closed pull request #10325: URL: https://github.com/apache/arrow/pull/10325 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow] nealrichardson commented on a change in pull request #10327: ARROW-12781: [R] Implement is.type() functions for dplyr

2021-05-14 Thread GitBox
nealrichardson commented on a change in pull request #10327: URL: https://github.com/apache/arrow/pull/10327#discussion_r632858178 ## File path: r/R/dplyr-functions.R ## @@ -109,6 +109,39 @@ nse_funcs$as.numeric <- function(x) { Expression$create("cast", x, options =

[GitHub] [arrow] nealrichardson commented on pull request #9615: ARROW-3316: [R] Multi-threaded conversion from R data.frame to Arrow table / record batch

2021-05-14 Thread GitBox
nealrichardson commented on pull request #9615: URL: https://github.com/apache/arrow/pull/9615#issuecomment-841545022 I think this needs rebase, and then that should expose the other place where `Table__from_dots` is called that needs `options_use_threads()` passed to it (which is the

[GitHub] [arrow] alippai edited a comment on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai edited a comment on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841541780 I'm closing this, as we agree that we don't want to remove the Buffer API. Additional ideas for speedup: Encode: Use TextEncoder.encodeInto() and

[GitHub] [arrow] alippai closed pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai closed pull request #10181: URL: https://github.com/apache/arrow/pull/10181 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841541780 I'm closing this, as we agree that we don't want to remove the TextEncoder API. Additional ideas for speedup: Encode: Use TextEncoder.encodeInto() and pre-allocate
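
The encodeInto idea mentioned above can be sketched roughly as follows. This is a minimal illustration only, not code from the Arrow JS sources: the helper name `encodeUtf8Into` and the buffer sizes are assumptions made for the example. It relies on the standard `TextEncoder.prototype.encodeInto`, which writes UTF-8 bytes into a caller-supplied `Uint8Array` instead of allocating a new one per call.

```ts
// Hypothetical helper (not part of Arrow JS): encode into a reusable scratch buffer.
const sharedEncoder = new TextEncoder();

function encodeUtf8Into(value: string, scratch: Uint8Array): Uint8Array {
  // One UTF-16 code unit expands to at most 3 UTF-8 bytes,
  // so value.length * 3 is a safe upper bound for the output size.
  if (scratch.length < value.length * 3) {
    throw new RangeError("scratch buffer too small for worst-case UTF-8 output");
  }
  const result = sharedEncoder.encodeInto(value, scratch);
  // Return a view over the bytes actually written; no copy is made.
  return scratch.subarray(0, result.written);
}

// Usage: allocate once, reuse for many short strings.
const scratchBuffer = new Uint8Array(1024);
const encoded = encodeUtf8Into("origin", scratchBuffer);
console.log(encoded.length);
```

The returned view aliases the scratch buffer, so a caller that needs to keep the bytes past the next call would have to copy them first.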

[GitHub] [arrow] nealrichardson commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-14 Thread GitBox
nealrichardson commented on a change in pull request #10269: URL: https://github.com/apache/arrow/pull/10269#discussion_r632849602 ## File path: r/R/record-batch.R ## @@ -161,7 +161,18 @@ RecordBatch$create <- function(..., schema = NULL) { out <-

[GitHub] [arrow] trxcllnt edited a comment on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt edited a comment on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841535011 We do have [separate entry-points](https://github.com/apache/arrow/blob/master/js/gulp/package-task.js#L49-L51) today, so that's totally fine. We'd need to have some

[GitHub] [arrow] trxcllnt commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841535011 We do have [separate entry-points](https://github.com/apache/arrow/blob/master/js/gulp/package-task.js#L49-L51) today, so that's totally fine. We'd need to have some shared

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841533090 What do you think about shipping different entry files for node and browser and solving this issue at compile time? -- This is an automated message from the Apache Git Service. To

[GitHub] [arrow] trxcllnt edited a comment on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt edited a comment on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841530218 Node didn't use to make the TextEncoder/TextDecoder globals, so if we want to support the faster way in node we should figure out a way to control this at the

[GitHub] [arrow] trxcllnt commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841530218 Node didn't use to make the TextEncoder/TextDecoder globals, so if we want to support node we should figure out a way to control this at the packaging or import stages. --
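
One way to picture "controlling this at the import stage" is to choose the encoder once, when the module loads, based on what the environment provides. The sketch below is an assumption-laden illustration, not the approach Arrow JS takes: `pickUtf8Encoder` and `Utf8Encode` are hypothetical names, and the environment is passed in as a plain object so the selection logic can be exercised without touching real globals.

```ts
// Hypothetical selection logic (names are illustrative, not from Arrow JS).
type Utf8Encode = (value: string) => Uint8Array;

function pickUtf8Encoder(env: { TextEncoder?: any; Buffer?: any }): Utf8Encode {
  // Prefer the standard TextEncoder when the environment exposes it.
  if (typeof env.TextEncoder === "function") {
    const encoder = new env.TextEncoder();
    return (value) => encoder.encode(value);
  }
  // Otherwise fall back to a Node-style Buffer, if one is available.
  if (env.Buffer && typeof env.Buffer.from === "function") {
    return (value) => {
      const buf = env.Buffer.from(value, "utf8");
      // Expose the same bytes as a plain Uint8Array view.
      return new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength);
    };
  }
  throw new Error("no UTF-8 encoder available in this environment");
}

// Resolved once at import time against the real globals.
const encodeUtf8 = pickUtf8Encoder(globalThis as any);
console.log(encodeUtf8("origin").length);
```

Separate node and browser entry points, as proposed elsewhere in the thread, would move the same decision to build time instead of load time.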

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841529879 Performance: I ran microbenchmarks for TextEncoder vs Buffer+utils and the Buffer-based version is 50% faster for small ASCII strings. As we are using larger strings or we switch
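
A microbenchmark of the kind described above might look roughly like the following. This is a hedged sketch, not the benchmark that was actually run: the sample string, iteration count, and the `measureMs` helper are assumptions, and the `Buffer` lookup goes through `globalThis` so the snippet still loads outside Node. Timings from a loop this simple are noisy and should only be read as a rough comparison.

```ts
// Hypothetical timing helper: runs fn repeatedly and returns elapsed milliseconds.
function measureMs(fn: () => void, iterations: number): number {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  return Date.now() - start;
}

const sample = "origin"; // a short ASCII string, as in the comparison above
const iterations = 100000; // assumed count; tune for the machine
const textEncoder = new TextEncoder();
const NodeBuffer = (globalThis as any).Buffer; // undefined outside Node

const timings: Record<string, number> = {
  TextEncoder: measureMs(() => { textEncoder.encode(sample); }, iterations),
};
if (NodeBuffer) {
  timings["Buffer.from"] = measureMs(() => { NodeBuffer.from(sample, "utf8"); }, iterations);
}
console.log(timings);
```

Repeating the run with longer or non-ASCII inputs would show whether the gap holds beyond small strings.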

[GitHub] [arrow] trxcllnt commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841529749 @alippai if you change the conditional to this and re-run `yarn build -t es2015 -m cjs`, you can force it to use `Buffer.from`: ```ts const useNativeEncoders = !_Buffer

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841527729 Idk what causes the differences. @trxcllnt just ran a benchmark to compare buffer vs no buffer. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841526434 @domoritz what's happening? Am I reading the code wrong and it's actually hitting https://github.com/apache/arrow/blob/master/js/src/util/utf8.ts#L35-L38 somehow, or is it just

[GitHub] [arrow] domoritz edited a comment on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz edited a comment on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841524268 Master Parse "tracks": Table.from x 5,289 ops/sec ±10.58% (76 runs sampled) avg: 0.19ms 1.14% of a frame @ 60FPS Parse "tracks":

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841524268 Master Parse "tracks": Table.from x 5,289 ops/sec ±10.58% (76 runs sampled) avg: 0.19ms 1.14% of a frame @ 60FPS Parse "tracks":

[GitHub] [arrow] github-actions[bot] commented on pull request #10329: [Packaging][Java] Improve JNI jars build [WIP]

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10329: URL: https://github.com/apache/arrow/pull/10329#issuecomment-841522714 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you

[GitHub] [arrow] kszucs opened a new pull request #10329: [Packaging][Java] Improve JNI jars build [WIP]

2021-05-14 Thread GitBox
kszucs opened a new pull request #10329: URL: https://github.com/apache/arrow/pull/10329 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] alippai edited a comment on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai edited a comment on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841520624 While benchmarking it, I realized that we already use TextEncoder and TextDecoder only. `useNativeEncoders` always evaluates to `true` in all supported Node

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841520624 While benchmarking it, I realized that we already use TextEncoder and TextDecoder only. `useNativeEncoders` always evaluates to `true` in all supported Node versions and

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841519307 Caching keys is https://issues.apache.org/jira/browse/ARROW-10220 (cc @bmschmidt). -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow] kou commented on pull request #10300: ARROW-12699: [CI][Packaging][Java] Generate a jar compatible with Linux and MacOS for all Arrow components

2021-05-14 Thread GitBox
kou commented on pull request #10300: URL: https://github.com/apache/arrow/pull/10300#issuecomment-841518230 Great! Can we also generate `.jar` that doesn't need to bundle C++ libraries as the next step? Then we don't need to generate any `.jar` in our release process. -- This is

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841514959 Dictionary may work as well, indeed. My understanding was that the buffer -> string conversions happen only once for every dictionary value and it's cloned later. -- This

[GitHub] [arrow] kou commented on pull request #10325: Update README.md

2021-05-14 Thread GitBox
kou commented on pull request #10325: URL: https://github.com/apache/arrow/pull/10325#issuecomment-841513829 I think that this is a needless change. The original sentence is readable enough. Can we close this? -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow] trxcllnt commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
trxcllnt commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841512673 @alippai that's the script. The categorical dictionaries are strings and will show up in the output like this: ``` Get "tracks" values by index: name: 'origin',

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841512097 @domoritz I agree that some benchmarks should be added, I'll look into that this weekend. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] alippai commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
alippai commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841511834 https://github.com/apache/arrow/blob/master/js/test/data/tables/generate.py isn't this the test data generator script? I see only float and categorical here, no string.

[GitHub] [arrow-datafusion] alamb closed issue #333: Add easier to understand physical plan printing in `EXPLAIN`

2021-05-14 Thread GitBox
alamb closed issue #333: URL: https://github.com/apache/arrow-datafusion/issues/333 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb commented on issue #333: Add easier to understand physical plan printing in `EXPLAIN`

2021-05-14 Thread GitBox
alamb commented on issue #333: URL: https://github.com/apache/arrow-datafusion/issues/333#issuecomment-841511146 Closed with https://github.com/apache/arrow-datafusion/pull/337 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [arrow-datafusion] jorgecarleitao merged pull request #330: Make it easier for developers to find Ballista documentation

2021-05-14 Thread GitBox
jorgecarleitao merged pull request #330: URL: https://github.com/apache/arrow-datafusion/pull/330 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-datafusion] jorgecarleitao closed issue #329: Add Ballista Getting Started documentation

2021-05-14 Thread GitBox
jorgecarleitao closed issue #329: URL: https://github.com/apache/arrow-datafusion/issues/329 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-datafusion] jorgecarleitao merged pull request #341: Update arrow dependencies again

2021-05-14 Thread GitBox
jorgecarleitao merged pull request #341: URL: https://github.com/apache/arrow-datafusion/pull/341 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow] github-actions[bot] commented on pull request #10328: ARROW-12785: [CI] the r-devdocs build errors when brew installing gcc

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10328: URL: https://github.com/apache/arrow/pull/10328#issuecomment-841501243 Revision: aca4a4dcb55436549648dd1309644b979a749ddf Submitted crossbow builds: [ursacomputing/crossbow @

[GitHub] [arrow] jonkeane commented on pull request #10328: ARROW-12785: [CI] the r-devdocs build errors when brew installing gcc

2021-05-14 Thread GitBox
jonkeane commented on pull request #10328: URL: https://github.com/apache/arrow/pull/10328#issuecomment-841500916 @github-actions crossbow submit test-r-devdocs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] github-actions[bot] commented on pull request #10328: ARROW-12785: [CI] the r-devdocs build errors when brew installing gcc

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10328: URL: https://github.com/apache/arrow/pull/10328#issuecomment-841500784 https://issues.apache.org/jira/browse/ARROW-12785 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] jonkeane opened a new pull request #10328: ARROW-12785: [CI] the r-devdocs build errors when brew installing gcc

2021-05-14 Thread GitBox
jonkeane opened a new pull request #10328: URL: https://github.com/apache/arrow/pull/10328 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow] edponce commented on a change in pull request #10317: ARROW-12713 [C++] String reverse kernel

2021-05-14 Thread GitBox
edponce commented on a change in pull request #10317: URL: https://github.com/apache/arrow/pull/10317#discussion_r632797486 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -266,6 +266,52 @@ void EnsureLookupTablesFilled() {} #endif // ARROW_WITH_UTF8PROC

[GitHub] [arrow] ianmcook commented on a change in pull request #10327: ARROW-12781: [R] Implement is.type() functions for dplyr

2021-05-14 Thread GitBox
ianmcook commented on a change in pull request #10327: URL: https://github.com/apache/arrow/pull/10327#discussion_r632797152 ## File path: r/R/expression.R ## @@ -76,7 +76,15 @@ Expression <- R6Class("Expression", inherit = ArrowObject, public = list( ToString =

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841497150 I think we should add a benchmark before merging this pull request @trxcllnt. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [arrow] ianmcook commented on a change in pull request #10327: ARROW-12781: [R] Implement is.type() functions for dplyr

2021-05-14 Thread GitBox
ianmcook commented on a change in pull request #10327: URL: https://github.com/apache/arrow/pull/10327#discussion_r632795341 ## File path: r/R/expression.R ## @@ -76,7 +76,15 @@ Expression <- R6Class("Expression", inherit = ArrowObject, public = list( ToString =

[GitHub] [arrow] lidavidm closed pull request #10272: ARROW-12677: [Python] Add a mask argument to pyarrow.StructArray.from_arrays

2021-05-14 Thread GitBox
lidavidm closed pull request #10272: URL: https://github.com/apache/arrow/pull/10272 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] kszucs closed pull request #10300: ARROW-12699: [CI][Packaging][Java] Generate a jar compatible with Linux and MacOS for all Arrow components

2021-05-14 Thread GitBox
kszucs closed pull request #10300: URL: https://github.com/apache/arrow/pull/10300 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow] github-actions[bot] commented on pull request #10327: ARROW-12781: [R] Implement is.type() functions for dplyr

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10327: URL: https://github.com/apache/arrow/pull/10327#issuecomment-841489435 https://issues.apache.org/jira/browse/ARROW-12781 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] ianmcook opened a new pull request #10327: ARROW-12781: [R] Implement is.type() functions for dplyr

2021-05-14 Thread GitBox
ianmcook opened a new pull request #10327: URL: https://github.com/apache/arrow/pull/10327 - adds base R-flavored `is.*()` functions for checking column types in dplyr - adds a `$type_id()` method to `Expression` and to some other R6 classes that were missing it - changes how

[GitHub] [arrow] westonpace commented on a change in pull request #10272: ARROW-12677: [Python] Add a mask argument to pyarrow.StructArray.from_arrays

2021-05-14 Thread GitBox
westonpace commented on a change in pull request #10272: URL: https://github.com/apache/arrow/pull/10272#discussion_r632767457 ## File path: python/pyarrow/array.pxi ## @@ -2189,6 +2227,18 @@ cdef class StructArray(Array): if names is not None and fields is not None:

[GitHub] [arrow-rs] yordan-pavlov commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-05-14 Thread GitBox
yordan-pavlov commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-841471236 @jorgecarleitao the int32 support can be split out in a separate PR, I added it now mostly so that I can benchmark how this approach would work for primitive types. -- This

[GitHub] [arrow-rs] alamb commented on pull request #288: manually bump development version

2021-05-14 Thread GitBox
alamb commented on pull request #288: URL: https://github.com/apache/arrow-rs/pull/288#issuecomment-841469082 And the tests are now all  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] lidavidm closed pull request #9656: ARROW-11772: [C++] Provide reentrant IPC file reader

2021-05-14 Thread GitBox
lidavidm closed pull request #9656: URL: https://github.com/apache/arrow/pull/9656 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow] lidavidm commented on pull request #9656: ARROW-11772: [C++] Provide reentrant IPC file reader

2021-05-14 Thread GitBox
lidavidm commented on pull request #9656: URL: https://github.com/apache/arrow/pull/9656#issuecomment-841464661 CI passes, so I'll go ahead and merge this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] lidavidm closed pull request #10324: ARROW-12793: [Python] Fix support for pyarrow debug builds

2021-05-14 Thread GitBox
lidavidm closed pull request #10324: URL: https://github.com/apache/arrow/pull/10324 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow] domoritz commented on pull request #10181: ARROW-12578: [JS] Remove Buffer in favor of TextEncoder API in NodeJS

2021-05-14 Thread GitBox
domoritz commented on pull request #10181: URL: https://github.com/apache/arrow/pull/10181#issuecomment-841462855 There are discussions about a 4.0.1 release on the mailing list. This pull request would need to get in soon to be included in the release. -- This is an automated message

[GitHub] [arrow] thisisnic commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-14 Thread GitBox
thisisnic commented on a change in pull request #10269: URL: https://github.com/apache/arrow/pull/10269#discussion_r632754954 ## File path: r/R/record-batch.R ## @@ -161,7 +161,18 @@ RecordBatch$create <- function(..., schema = NULL) { out <-

[GitHub] [arrow] rok commented on a change in pull request #9758: ARROW-9054: [C++] Add ScalarAggregateOptions

2021-05-14 Thread GitBox
rok commented on a change in pull request #9758: URL: https://github.com/apache/arrow/pull/9758#discussion_r632754148 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator { Status

[GitHub] [arrow] thisisnic commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

2021-05-14 Thread GitBox
thisisnic commented on a change in pull request #10269: URL: https://github.com/apache/arrow/pull/10269#discussion_r632750639 ## File path: r/R/record-batch.R ## @@ -161,7 +161,18 @@ RecordBatch$create <- function(..., schema = NULL) { out <-

[GitHub] [arrow-rs] jorgecarleitao commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-05-14 Thread GitBox
jorgecarleitao commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-841450481 ccing @nevi-me since he is the expert here. I'd say let's go for it. @yordan-pavlov, is the PR decomposable or is it not worth the effort trying to split it? -- This is

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #341: Update arrow dependencies again

2021-05-14 Thread GitBox
codecov-commenter commented on pull request #341: URL: https://github.com/apache/arrow-datafusion/pull/341#issuecomment-841449969 #

[GitHub] [arrow] github-actions[bot] commented on pull request #10326: ARROW-12791: [R] Better error handling for DatasetFactory$Finish() when no format specified

2021-05-14 Thread GitBox
github-actions[bot] commented on pull request #10326: URL: https://github.com/apache/arrow/pull/10326#issuecomment-841447425 https://issues.apache.org/jira/browse/ARROW-12791 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow-datafusion] alamb merged pull request #337: Implement readable explain plans for physical plans

2021-05-14 Thread GitBox
alamb merged pull request #337: URL: https://github.com/apache/arrow-datafusion/pull/337 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

2021-05-14 Thread GitBox
alamb commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632737073 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #328: fix 305 by using a null array as param for zero param functions

2021-05-14 Thread GitBox
alamb commented on a change in pull request #328: URL: https://github.com/apache/arrow-datafusion/pull/328#discussion_r632736152 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1358,6 +1366,18 @@ impl fmt::Display for ScalarFunctionExpr { } } +/// null

[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #303: add random SQL function

2021-05-14 Thread GitBox
jorgecarleitao commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632735574 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { }

[GitHub] [arrow-datafusion] alamb opened a new pull request #341: Update arrow dependencies again

2021-05-14 Thread GitBox
alamb opened a new pull request #341: URL: https://github.com/apache/arrow-datafusion/pull/341 Until we get regular arrow-rs releases, I want to keep pulling updates from arrow-rs into datafusion frequently and thus manually -- This is an automated message from the Apache Git Service.

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

2021-05-14 Thread GitBox
Dandandan commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632728287 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow-rs] Dandandan commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-05-14 Thread GitBox
Dandandan commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-841434097 Cool, thanks for the update @yordan-pavlov. Let's see; a small slowdown in some places can be fine if it's offset by other improvements and/or better code quality or design! --

[GitHub] [arrow-rs] jorgecarleitao commented on pull request #288: manually bump development version

2021-05-14 Thread GitBox
jorgecarleitao commented on pull request #288: URL: https://github.com/apache/arrow-rs/pull/288#issuecomment-841433092 @alamb the integration tests were failing about 8h ago, but all good now afaik. Someone fixed this in apache/arrow :) -- This is an automated message from the Apache

[GitHub] [arrow-rs] alamb commented on pull request #288: manually bump development version

2021-05-14 Thread GitBox
alamb commented on pull request #288: URL: https://github.com/apache/arrow-rs/pull/288#issuecomment-841432376 The integration test failure seems unrelated (though maybe not?) -- https://github.com/apache/arrow-rs/pull/288/checks?check_run_id=2574974207 I retriggered it to see if it

[GitHub] [arrow-rs] alamb closed issue #230: Add support for pretty-printing Decimal numbers

2021-05-14 Thread GitBox
alamb closed issue #230: URL: https://github.com/apache/arrow-rs/issues/230 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow-rs] alamb merged pull request #273: Added Decimal support to pretty-print display utility (#230)

2021-05-14 Thread GitBox
alamb merged pull request #273: URL: https://github.com/apache/arrow-rs/pull/273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [arrow-rs] alamb commented on a change in pull request #273: Added Decimal support to pretty-print display utility (#230)

2021-05-14 Thread GitBox
alamb commented on a change in pull request #273: URL: https://github.com/apache/arrow-rs/pull/273#discussion_r632724899 ## File path: arrow/src/util/display.rs ## @@ -217,6 +231,9 @@ pub fn array_value_to_string(column: &array::ArrayRef, row: usize) -> Result

[GitHub] [arrow-rs] yordan-pavlov commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-05-14 Thread GitBox
yordan-pavlov commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-841429737 @alamb thank you for taking the time to review my benchmark results. I have done some profiling already, and although I haven't spent very long looking into the results, it's

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

2021-05-14 Thread GitBox
alamb commented on a change in pull request #303: URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632722624 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr { } fn

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #334: Add window expr

2021-05-14 Thread GitBox
alamb commented on a change in pull request #334: URL: https://github.com/apache/arrow-datafusion/pull/334#discussion_r632720978 ## File path: datafusion/src/logical_plan/builder.rs ## @@ -289,6 +289,14 @@ impl LogicalPlanBuilder { })) } +pub fn window( +

[GitHub] [arrow] thisisnic opened a new pull request #10326: ARROW-12791: [R] Better error handling for DatasetFactory$Finish() when no format specified

2021-05-14 Thread GitBox
thisisnic opened a new pull request #10326: URL: https://github.com/apache/arrow/pull/10326 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this

[GitHub] [arrow-rs] jorgecarleitao commented on a change in pull request #289: Added changelog generator script and configuration.

2021-05-14 Thread GitBox
jorgecarleitao commented on a change in pull request #289: URL: https://github.com/apache/arrow-rs/pull/289#discussion_r632717500 ## File path: change_log.sh ## @@ -0,0 +1,26 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more

[GitHub] [arrow-rs] alamb commented on issue #200: Use iterators to increase performance of creating Arrow arrays

2021-05-14 Thread GitBox
alamb commented on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-841423524 > @Dandandan @jorgecarleitao @alamb Let me know what you think - is the simplification of reading arrow arrays using a single, iterator-based abstraction worth a performance hit in a

[GitHub] [arrow-rs] alamb commented on a change in pull request #289: Added changelog generator script and configuration.

2021-05-14 Thread GitBox
alamb commented on a change in pull request #289: URL: https://github.com/apache/arrow-rs/pull/289#discussion_r632713588 ## File path: change_log.sh ## @@ -0,0 +1,26 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license

[GitHub] [arrow-datafusion] alamb commented on pull request #335: Demonstrate binding NOW() during planning (ALTERNATE TO StatefulFunction)

2021-05-14 Thread GitBox
alamb commented on pull request #335: URL: https://github.com/apache/arrow-datafusion/pull/335#issuecomment-841410480 Incorporated into #288 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow-datafusion] alamb closed pull request #335: Demonstrate binding NOW() during planning (ALTERNATE TO StatefulFunction)

2021-05-14 Thread GitBox
alamb closed pull request #335: URL: https://github.com/apache/arrow-datafusion/pull/335 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb merged pull request #288: [Datafusion] NOW() function support

2021-05-14 Thread GitBox
alamb merged pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] alamb closed issue #251: Implement Postgres compatible `now()` function

2021-05-14 Thread GitBox
alamb closed issue #251: URL: https://github.com/apache/arrow-datafusion/issues/251 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #337: Implement readable explain plans for physical plans

2021-05-14 Thread GitBox
jorgecarleitao commented on a change in pull request #337: URL: https://github.com/apache/arrow-datafusion/pull/337#discussion_r632701908 ## File path: datafusion/src/logical_plan/plan.rs ## @@ -356,13 +356,15 @@ pub enum Partitioning { /// after all children have been

[GitHub] [arrow-datafusion] alamb commented on pull request #288: [Datafusion] NOW() function support

2021-05-14 Thread GitBox
alamb commented on pull request #288: URL: https://github.com/apache/arrow-datafusion/pull/288#issuecomment-841408299 I filed https://github.com/apache/arrow-datafusion/issues/340 to track possible future work on stateful functions -- This is an automated message from the Apache Git

[GitHub] [arrow-datafusion] alamb commented on issue #340: StatefulFunctions

2021-05-14 Thread GitBox
alamb commented on issue #340: URL: https://github.com/apache/arrow-datafusion/issues/340#issuecomment-841408125 My personal take is that adding some way to mark a `ScalarFunction` as being `immutable`, `stable` or `volatile` would be valuable for query optimization (e.g. we could

[GitHub] [arrow-datafusion] alamb opened a new issue #340: StatefulFunctions

2021-05-14 Thread GitBox
alamb opened a new issue #340: URL: https://github.com/apache/arrow-datafusion/issues/340 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** On a PR that added what postgres would term a `stable` function (something that is not the

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #337: Implement readable explain plans for physical plans

2021-05-14 Thread GitBox
codecov-commenter commented on pull request #337: URL: https://github.com/apache/arrow-datafusion/pull/337#issuecomment-841406715 #
