jorgecarleitao commented on a change in pull request #8630:
URL: https://github.com/apache/arrow/pull/8630#discussion_r523390298
##
File path: rust/arrow/benches/filter_kernels.rs
##
@@ -14,137 +14,136 @@
// KIND, either express or implied. See the License for the
// specifi
jorgecarleitao closed pull request #8645:
URL: https://github.com/apache/arrow/pull/8645
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to g
nevi-me commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523384710
##
File path: rust/datafusion/src/physical_plan/mod.rs
##
@@ -100,6 +100,30 @@ pub enum Distribution {
SinglePartition,
}
+/// Represents the result
jorgecarleitao commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523374868
##
File path: rust/datafusion/src/physical_plan/filter.rs
##
@@ -24,7 +24,7 @@ use std::sync::{Arc, Mutex};
use crate::error::{ExecutionError, Resul
jorgecarleitao commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523374604
##
File path: rust/datafusion/src/physical_plan/mod.rs
##
@@ -100,6 +100,30 @@ pub enum Distribution {
SinglePartition,
}
+/// Represents the
nevi-me commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523368564
##
File path: rust/datafusion/src/physical_plan/mod.rs
##
@@ -100,6 +100,30 @@ pub enum Distribution {
SinglePartition,
}
+/// Represents the result
arw2019 commented on a change in pull request #8474:
URL: https://github.com/apache/arrow/pull/8474#discussion_r523365931
##
File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc
##
@@ -151,6 +151,45 @@ std::unique_ptr MinMaxInit(KernelContext*
ctx, const KernelInitArgs
arw2019 commented on pull request #8474:
URL: https://github.com/apache/arrow/pull/8474#issuecomment-727132950
> Since this is similar to #8294, and most review on the code happened
there, does it make sense to get that PR merged first?
Yes, agreed - since this follows the pattern i
emkornfield commented on pull request #8644:
URL: https://github.com/apache/arrow/pull/8644#issuecomment-727073502
I'm -1 on allowing non-conforming IPC implementations.
wesm commented on pull request #8644:
URL: https://github.com/apache/arrow/pull/8644#issuecomment-727068271
From the specification
> Implementations are recommended to allocate memory on aligned addresses
(multiple of 8- or 64-bytes) and pad (overallocate) to a length that is a
mult
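The alignment and padding recommendation quoted above can be illustrated with a small sketch. The helper names `padded_len` and `is_aligned` are hypothetical and are not part of any Arrow API; the alignment is passed in as a parameter rather than fixed, and it is assumed to be a power of two (8 and 64 both are).

```rust
/// Round `len` up to the next multiple of `alignment`.
/// Hypothetical helper for illustration; `alignment` must be a power of two.
pub fn padded_len(len: usize, alignment: usize) -> usize {
    assert!(alignment.is_power_of_two(), "alignment must be a power of two");
    // Adding (alignment - 1) and masking off the low bits rounds up.
    (len + alignment - 1) & !(alignment - 1)
}

/// Return true when `addr` sits on an `alignment`-byte boundary.
/// Hypothetical helper for illustration; `alignment` must be a power of two.
pub fn is_aligned(addr: usize, alignment: usize) -> bool {
    assert!(alignment.is_power_of_two(), "alignment must be a power of two");
    addr & (alignment - 1) == 0
}
```

With these helpers, a 65-byte buffer padded for 64-byte alignment would occupy `padded_len(65, 64)` bytes, and an address check such as `is_aligned(ptr as usize, 64)` is the kind of test a reader might apply before taking a zero-copy path.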
andygrove commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523264105
##
File path: rust/datafusion/src/physical_plan/expressions.rs
##
@@ -1288,18 +1363,84 @@ impl PhysicalExpr for BinaryExpr {
Ok(self.left.nullabl
github-actions[bot] commented on pull request #8661:
URL: https://github.com/apache/arrow/pull/8661#issuecomment-727064616
https://issues.apache.org/jira/browse/ARROW-10581
njwhite commented on pull request #8644:
URL: https://github.com/apache/arrow/pull/8644#issuecomment-727063708
@wesm I disagree with your assertion that it's only useful in an
extraordinarily narrow use case - I've added a test case
`test_contiguous_buffers_mixed_types` to show a zero-copy
github-actions[bot] commented on pull request #8661:
URL: https://github.com/apache/arrow/pull/8661#issuecomment-727050993
Thanks for opening a pull request!
Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW
Then could
carols10cents commented on pull request #8641:
URL: https://github.com/apache/arrow/pull/8641#issuecomment-727048909
Ok, so, now the integration test job got cancelled after 360 min, and
suspiciously it appears to be cancelled [during the Flight
tests](https://github.com/apache/arrow/pull/
yordan-pavlov commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523245452
##
File path: rust/datafusion/src/physical_plan/expressions.rs
##
@@ -969,6 +975,42 @@ macro_rules! compute_utf8_op {
}};
}
+/// Invoke a comp
Fonsan opened a new pull request #8661:
URL: https://github.com/apache/arrow/pull/8661
yordan-pavlov commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523243341
##
File path: rust/datafusion/src/physical_plan/mod.rs
##
@@ -100,6 +100,30 @@ pub enum Distribution {
SinglePartition,
}
+/// Represents the
yordan-pavlov commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523241911
##
File path: rust/datafusion/src/physical_plan/expressions.rs
##
@@ -1288,18 +1363,84 @@ impl PhysicalExpr for BinaryExpr {
Ok(self.left.nul
andygrove commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-727043583
@Dandandan Thanks for the links. That addresses my concern. We do have a
benchmark crate in this repo with instructions for running a TPC-H with larger
data sets but I don't see
andygrove commented on pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#issuecomment-727041322
@yordan-pavlov I took a quick skim through and this is looking really good!
Could you rebase?
andygrove commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523238084
##
File path: rust/datafusion/src/physical_plan/mod.rs
##
@@ -100,6 +100,30 @@ pub enum Distribution {
SinglePartition,
}
+/// Represents the resu
andygrove commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523237664
##
File path: rust/datafusion/src/physical_plan/expressions.rs
##
@@ -969,6 +975,42 @@ macro_rules! compute_utf8_op {
}};
}
+/// Invoke a compute
andygrove commented on a change in pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#discussion_r523237229
##
File path: rust/datafusion/src/physical_plan/expressions.rs
##
@@ -1288,18 +1363,84 @@ impl PhysicalExpr for BinaryExpr {
Ok(self.left.nullabl
github-actions[bot] commented on pull request #8660:
URL: https://github.com/apache/arrow/pull/8660#issuecomment-727004972
https://issues.apache.org/jira/browse/ARROW-10173
yordan-pavlov opened a new pull request #8660:
URL: https://github.com/apache/arrow/pull/8660
This PR addresses the inefficient comparison to scalar values, where an
array is built with the scalar value repeated, by
changing the return value of expressions from `Result` to
`Result` wh
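The idea in this description, comparing against a scalar without first building an array of repeated values, can be sketched roughly as follows. This is an illustration only: the `Value` enum and the `gt` function are hypothetical names over plain `Vec<i64>` data, not the types or signatures used in the PR.

```rust
/// Hypothetical stand-in for an expression result: either a full column
/// or a single value that logically applies to every row.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Array(Vec<i64>),
    Scalar(i64),
}

/// Element-wise `left > right` that never materializes a repeated scalar
/// as an intermediate array. `num_rows` is only needed when both sides
/// are scalars.
pub fn gt(left: &Value, right: &Value, num_rows: usize) -> Vec<bool> {
    match (left, right) {
        (Value::Array(l), Value::Array(r)) => {
            l.iter().zip(r).map(|(a, b)| a > b).collect()
        }
        // The scalar is compared directly against each element.
        (Value::Array(l), Value::Scalar(s)) => l.iter().map(|a| a > s).collect(),
        (Value::Scalar(s), Value::Array(r)) => r.iter().map(|b| s > b).collect(),
        (Value::Scalar(a), Value::Scalar(b)) => vec![a > b; num_rows],
    }
}
```

The point of the design is that the array-versus-scalar arms read each column element once and allocate only the boolean output.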
BryanCutler commented on pull request #8057:
URL: https://github.com/apache/arrow/pull/8057#issuecomment-726960913
merged to master
BryanCutler closed pull request #8057:
URL: https://github.com/apache/arrow/pull/8057
BryanCutler commented on a change in pull request #8057:
URL: https://github.com/apache/arrow/pull/8057#discussion_r523152926
##
File path:
java/memory/memory-netty/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java
##
@@ -60,9 +59,6 @@
private UnsafeDirectLittle
arw2019 commented on pull request #8657:
URL: https://github.com/apache/arrow/pull/8657#issuecomment-726943228
> I don't know if there is interest in having a C++
`ChunkedArray::CombineChunks()` method as well (similarly as there is a
`Table::CombineChunks`), but that can also be added lat
arw2019 commented on a change in pull request #8657:
URL: https://github.com/apache/arrow/pull/8657#discussion_r523124148
##
File path: docs/source/python/api/tables.rst
##
@@ -29,6 +29,7 @@ Factory Functions
:toctree: ../generated/
chunked_array
+ combine_chunks
R
alamb commented on pull request #8553:
URL: https://github.com/apache/arrow/pull/8553#issuecomment-726900488
I plan to merge this tomorrow unless I hear otherwise. @jorgecarleitao /
@andygrove let me know if you have any concerns
rdettai commented on pull request #8553:
URL: https://github.com/apache/arrow/pull/8553#issuecomment-726895657
I had some code that was crashing because of the behavior aggregate had when
the wrapped exec first returned `Pending` when being polled. It now works
perfectly with this PR! Tha
rdettai commented on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726879119
Well, it is great that I did not start fixing the problem and I first
focused on building a test that pinpointed the issue. Long live the TDD! 😄
I'll rebase this as soon as
alamb closed pull request #8567:
URL: https://github.com/apache/arrow/pull/8567
alamb commented on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726864487
I will try to review it later today or tomorrow morning (UTC+5 time).
alamb commented on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726864283
https://github.com/apache/arrow/pull/8553 may also be related -- but it is
a much more invasive change than this PR
github-actions[bot] commented on pull request #8659:
URL: https://github.com/apache/arrow/pull/8659#issuecomment-726863521
https://issues.apache.org/jira/browse/ARROW-10480
lidavidm opened a new pull request #8659:
URL: https://github.com/apache/arrow/pull/8659
While files like "foo.parquet.gz" are nonstandard, we nonetheless shouldn't
autodetect compression due to such a naming scheme.
Dandandan edited a comment on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726843795
A nice overview is listed here:
https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md comparing
aHash to other algorithms. aHash passes the full suite of
htt
arw2019 commented on a change in pull request #8294:
URL: https://github.com/apache/arrow/pull/8294#discussion_r523059162
##
File path: cpp/src/arrow/compute/api_aggregate.cc
##
@@ -41,8 +41,12 @@ Result MinMax(const Datum& value, const
MinMaxOptions& options, ExecConte
ret
Dandandan edited a comment on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726843795
A nice overview is listed here:
https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md comparing
aHash to other algorithms. aHash passes the full suite of
http
Dandandan commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726843795
A nice overview is listed here:
https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md comparing
aHash to other algorithms. aHash passes the full suite of
https://git
andygrove commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726839597
Do we have a feel for the performance implications of this change for large
data sets as opposed to the micro benchmarks?
kszucs commented on pull request #8567:
URL: https://github.com/apache/arrow/pull/8567#issuecomment-726838793
@alamb nope, it's good to go.
github-actions[bot] commented on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726825009
https://issues.apache.org/jira/browse/ARROW-10577
rdettai edited a comment on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726822182
I initially considered creating a first class citizen `YieldingExec`
(https://gist.github.com/rdettai/c2045be688d457cb346c41e8769ed5d8), but as it
will likely only be used
rdettai commented on pull request #8658:
URL: https://github.com/apache/arrow/pull/8658#issuecomment-726822182
I initially considered creating a first class citizen `YieldingExec`
(https://gist.github.com/rdettai/c2045be688d457cb346c41e8769ed5d8), but as it
will likely only be used in the
rdettai opened a new pull request #8658:
URL: https://github.com/apache/arrow/pull/8658
> This happens when executing a DataFusion query plan with hash aggregation
where the data source is not ready on the first call by the Executor, and the
async state machine is passed to a pending state
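The situation described here, a source that reports `Pending` on the first poll and only yields data later, can be reproduced in isolation with a small sketch using only the standard library. The names `YieldOnce`, `NoopWake`, and `poll_to_completion` are hypothetical and are not taken from DataFusion or from the gist mentioned in the thread.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that reports `Pending` on its first poll and
/// `Ready(value)` on the second, mimicking a source that is not ready yet.
pub struct YieldOnce {
    yielded: bool,
    value: u32,
}

impl YieldOnce {
    pub fn new(value: u32) -> Self {
        YieldOnce { yielded: false, value }
    }
}

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.yielded {
            Poll::Ready(self.value)
        } else {
            self.yielded = true;
            // Ask to be polled again; a real source would wake on data arrival.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing, which is enough when polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll `fut` by hand until it completes, returning the output and the
/// number of polls it took.
pub fn poll_to_completion<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

A consumer that assumes the first poll always returns data would mishandle `YieldOnce`, which is the kind of behavior a regression test for this issue would want to exercise.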
kszucs commented on pull request #8650:
URL: https://github.com/apache/arrow/pull/8650#issuecomment-726818274
cc @bkietz since we co-authored the python-side refactor
andygrove commented on pull request #8283:
URL: https://github.com/apache/arrow/pull/8283#issuecomment-726811184
@alamb That is a good question. I have been too busy at work lately to work
on Arrow/DataFusion/Ballista but I have been spending some time contemplating
where to go next.
bkietz commented on pull request #8472:
URL: https://github.com/apache/arrow/pull/8472#issuecomment-726799854
@pitrou yes but not soon
bkietz closed pull request #8652:
URL: https://github.com/apache/arrow/pull/8652
bkietz commented on a change in pull request #8652:
URL: https://github.com/apache/arrow/pull/8652#discussion_r522984668
##
File path: cpp/src/arrow/array/validate.cc
##
@@ -392,96 +376,159 @@ Status ValidateArray(const Array& array) {
type.ToString(
romainfrancois edited a comment on pull request #8650:
URL: https://github.com/apache/arrow/pull/8650#issuecomment-726789546
``` r
library(arrow, warn.conflicts = FALSE)
arrow:::vec_to_arrow(1:2, int32())
#> Array
#>
#> [
#> 1,
#> 2
#> ]
arrow:::vec_to_arr
paddyhoran commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726793208
> I prefer using t1ha rather than ahash, which is proven to be sound.
The `t1ha` crate is `Licensed under zlib License`. I don't think that is
compatible with Apache (`ahash` is).
romainfrancois commented on pull request #8650:
URL: https://github.com/apache/arrow/pull/8650#issuecomment-726789546
``` r
library(arrow, warn.conflicts = FALSE)
arrow:::vec_to_arrow(1:2, int32())
#> Array
#>
#> [
#> 1,
#> 2
#> ]
arrow:::vec_to_arrow(c(1,
romainfrancois commented on pull request #8650:
URL: https://github.com/apache/arrow/pull/8650#issuecomment-726775206
Thanks @kszucs for the direct help. This is very far from done, but it's a
start, and perhaps we can resume the conversation here.
AFAIK, there is no R equivalent to
jorisvandenbossche closed pull request #8212:
URL: https://github.com/apache/arrow/pull/8212
jorisvandenbossche commented on pull request #8212:
URL: https://github.com/apache/arrow/pull/8212#issuecomment-726769806
Since the Java implementation of Parquet has LZO, I suppose that's a good
enough confirmation ;)
vertexclique edited a comment on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-726767196
Yes, it won't until that method is rewritten using the bit-slice iterator :)
written here: https://github.com/apache/arrow/pull/8645#issuecomment-725957761
p.s: totally
jorisvandenbossche commented on a change in pull request #8474:
URL: https://github.com/apache/arrow/pull/8474#discussion_r522953360
##
File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc
##
@@ -151,6 +151,45 @@ std::unique_ptr MinMaxInit(KernelContext*
ctx, const Kern
vertexclique edited a comment on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-726767196
Yes, it won't until that method is rewritten using the bit-slice iterator :)
p.s: totally unrelated topic, how do you run Valgrind on mac?
vertexclique commented on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-726767196
Yes, it won't until that method is rewritten using the bit-slice iterator :)
vertexclique commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726764889
I prefer using t1ha rather than ahash, which is proven to be sound.
alamb closed pull request #8656:
URL: https://github.com/apache/arrow/pull/8656
paddyhoran commented on pull request #8656:
URL: https://github.com/apache/arrow/pull/8656#issuecomment-726763619
> The
https://github.com/apache/arrow/pull/8656/checks?check_run_id=1394039681 build
on travis seems to have been queued for many hours at this point. I am thinking
that mergi
alamb commented on pull request #8654:
URL: https://github.com/apache/arrow/pull/8654#issuecomment-726760544
There appears to be a diff in the CI tests:
https://github.com/apache/arrow/pull/8654/checks?check_run_id=1392391221
```
execution::context::tests::count_distin
alamb commented on pull request #8656:
URL: https://github.com/apache/arrow/pull/8656#issuecomment-726759319
The https://github.com/apache/arrow/pull/8656/checks?check_run_id=1394039681
build on travis seems to have been queued for many hours at this point. I am
thinking that merging this
jorisvandenbossche commented on a change in pull request #8294:
URL: https://github.com/apache/arrow/pull/8294#discussion_r522942123
##
File path: cpp/src/arrow/compute/api_aggregate.cc
##
@@ -41,8 +41,12 @@ Result MinMax(const Datum& value, const
MinMaxOptions& options, ExecC
alamb commented on pull request #8567:
URL: https://github.com/apache/arrow/pull/8567#issuecomment-726757507
@jorgecarleitao / @nevi-me / @kszucs is there any outstanding work for
this PR or shall we merge it in?
jorisvandenbossche commented on a change in pull request #8294:
URL: https://github.com/apache/arrow/pull/8294#discussion_r522941736
##
File path: cpp/src/arrow/compute/api_aggregate.h
##
@@ -154,7 +154,21 @@ Result MinMax(const Datum& value,
const MinMaxO
jorgecarleitao edited a comment on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726756578
Thanks a lot, @alamb , really useful data points ❤️
For me that is enough of a reason: fix UB with `safe` code, and figure out a
way to perform multi-bit assig
jorgecarleitao commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726756578
Thanks a lot, @alamb , really useful data points ❤️
For me that is enough of a reason: fix UB with `safe` code, and figure out a
way to perform multi-bit assignment o
alamb commented on pull request #8283:
URL: https://github.com/apache/arrow/pull/8283#issuecomment-726756633
@andygrove I wonder what, if anything, you plan to do with this PR now
alamb commented on a change in pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#discussion_r522935301
##
File path: rust/arrow/src/util/bit_util.rs
##
@@ -99,36 +99,6 @@ pub unsafe fn unset_bit_raw(data: *mut u8, i: usize) {
*data.add(i >> 3) ^= BIT_MASK[
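For comparison with the raw-pointer helper in this hunk, bounds-checked bit helpers over a byte slice might look like the sketch below. The function names are hypothetical and this is not the code proposed in the PR; note also that `unset_bit` here clears the bit with AND-NOT, which is a different operation from the XOR shown in the hunk (XOR toggles the bit instead of clearing it).

```rust
/// Single-bit masks, indexed by bit position within a byte.
const BIT_MASK: [u8; 8] = [1, 2, 4, 8, 16, 32, 64, 128];

/// Return whether bit `i` is set. Panics if `i` is out of bounds.
pub fn get_bit(data: &[u8], i: usize) -> bool {
    data[i >> 3] & BIT_MASK[i & 7] != 0
}

/// Set bit `i` to 1. Panics if `i` is out of bounds.
pub fn set_bit(data: &mut [u8], i: usize) {
    data[i >> 3] |= BIT_MASK[i & 7];
}

/// Clear bit `i` to 0 (AND-NOT, so clearing twice is harmless).
/// Panics if `i` is out of bounds.
pub fn unset_bit(data: &mut [u8], i: usize) {
    data[i >> 3] &= !BIT_MASK[i & 7];
}
```

Slice indexing turns an out-of-range bit index into a panic rather than an invalid read or write, at the cost of a bounds check per call.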
alamb commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726749635
FWIW I ran the code in #8598 under valgrind and it does not appear to fix
the issue https://github.com/apache/arrow/pull/8598#issuecomment-726749085
alamb edited a comment on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726749635
FWIW I ran the code in #8598 under valgrind and it does not appear to fix
the issue -- see details in
https://github.com/apache/arrow/pull/8598#issuecomment-726749085
alamb commented on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-726749085
Some additional data: I ran the tests under valgrind (as described in
https://github.com/apache/arrow/pull/8645#issuecomment-726736494) on this
branch after rebasing against master.
maartenbreddels commented on pull request #8628:
URL: https://github.com/apache/arrow/pull/8628#issuecomment-726748561
@pitrou I think this is ready to go/review.
alamb commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726742048
> (if you would be so kind, could you quickly run fd75933, just to test whether
this PR addresses the issue?)
@jorgecarleitao -- I did so. There are no errors reported by valgrind
jorgecarleitao commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726737738
(if you would be so kind, could you quickly run fd75933, just to test whether
this PR addresses the issue?)
alamb commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726736494
In terms of evidence that there is a problem on master, I ran the arrow test
suite under `valgrind` @ 30516049522c1a527ffb375e7790102f58edb4f9 on master and
it does flag an invalid r
jorisvandenbossche commented on a change in pull request #8657:
URL: https://github.com/apache/arrow/pull/8657#discussion_r522913647
##
File path: python/pyarrow/tests/test_array.py
##
@@ -2643,6 +2643,15 @@ def test_concat_array_invalid_type():
pa.concat_arrays(arr)
alamb commented on pull request #8645:
URL: https://github.com/apache/arrow/pull/8645#issuecomment-726724609
I am going to take another hard look at
https://github.com/apache/arrow/pull/8598 and see if we can get enough
consensus to get it merged
alamb commented on a change in pull request #8656:
URL: https://github.com/apache/arrow/pull/8656#discussion_r522898144
##
File path: rust/arrow/src/array/mod.rs
##
@@ -121,8 +121,8 @@ pub use self::array_primitive::PrimitiveArray;
pub use self::array_string::LargeStringArray;
liyafan82 commented on a change in pull request #8210:
URL: https://github.com/apache/arrow/pull/8210#discussion_r522824604
##
File path: java/performance/pom.xml
##
@@ -169,10 +173,17 @@
${benchmark.filter}
-f
bkietz closed pull request #8582:
URL: https://github.com/apache/arrow/pull/8582
chr1st1ank commented on issue #8607:
URL: https://github.com/apache/arrow/issues/8607#issuecomment-726609910
This can be reproduced with the following commands in ipython.
In effect the attempt to write to a file without write permissions to it
results in the deletion of this file (of co