"David Li" writes:
> Thanks for the clarification, Yibo, looking forward to the results. Even if it
> is a very hacky PoC it will be interesting to see how it affects performance,
> though as Keith points out there are benefits in general to UCX (or a similar
> library), and we can work out the
Jorge Cardoso Leitão writes:
> Yes, I expect aligned SIMD loads to be faster.
>
> My understanding is that we do not need an alignment requirement for this,
> though: split the buffer in 3, [unaligned][aligned][unaligned], use aligned
> loads for the middle and un-aligned (or not even SIMD) for
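Jorge's head/middle/tail split can be sketched roughly as below. This is an illustrative example, not code from the thread; the function name and the 32-byte (AVX-width) alignment target are assumptions, and it presumes the input pointer is at least `float`-aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch: process a float buffer in three parts, [unaligned][aligned][unaligned],
// so the middle portion starts on a 32-byte boundary and the compiler can use
// aligned SIMD loads for it.
float sum_split(const float* data, size_t n) {
  const size_t kAlign = 32;  // bytes; one 256-bit AVX register
  uintptr_t addr = reinterpret_cast<uintptr_t>(data);
  size_t head = (kAlign - addr % kAlign) % kAlign / sizeof(float);
  if (head > n) head = n;

  float total = 0.0f;
  size_t i = 0;
  for (; i < head; ++i) total += data[i];       // unaligned prologue, scalar
  size_t mid_end = head + (n - head) / 8 * 8;   // whole 8-float chunks
  for (; i < mid_end; ++i) total += data[i];    // aligned middle: eligible for
                                                // aligned SIMD loads
  for (; i < n; ++i) total += data[i];          // unaligned tail, scalar
  return total;
}
```

The point of the split is that no alignment requirement is imposed on the buffer itself; only the middle loop benefits from alignment.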
Yibo Cai writes:
> HPC infrastructure normally leverages RDMA for fast data transfer among
> storage nodes and compute nodes. Computation tasks are dispatched to
> compute nodes with best-fit resources.
>
> Concretely, we are investigating porting UCX as a Flight transport layer.
> UCX is a
Andy Grove writes:
>
> Looking at this purely from the DataFusion/Ballista point of view, what I
> would be interested in would be having a branch of DF that uses arrow2 and
> once that branch has all tests passing and can run queries with performance
> that is at least as good as the original
Andy Grove writes:
> We started looking at the documentation for git filter-branch and it
> recommends not to use it. It states that "git-filter-branch is riddled with
> gotchas resulting in various ways to easily corrupt repos or end up with a
> mess worse than what you started with:".
I've
Wes McKinney writes:
> I think we should take a more serious look at Buildkite for some of our CI.
>
> * First of all, it's very easy to connect self-hosted workers and
> supports ephemeral cloud workers in a way that would be difficult or
> impossible with GHA. No need to have Infra fiddle with
I'm interested in providing some path to make this extensible. To pick an
example, suppose the user wants to compute the first k principal components.
We've talked [1] about the possibility of incorporating richer communication
semantics in Ballista (a la MPI sub-communicators) and numerical
"Du, Frank" writes:
> The PR I committed provides basic support for runtime dispatching. I
> agree that the compiler should generate good vectorized code for the
> non-null data part, but in fact it didn't; jedbrown pointed out that we
> can force the compiler to emit SIMD using some additional pragmas,
> something like
>
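The kind of pragma hint being discussed might look like the sketch below. This is a hedged illustration, not the PR's actual code; the function name and element type are made up, and which pragma applies depends on the compiler.

```cpp
#include <cstdint>

// Sketch: nudge the compiler to vectorize the non-null fast path even when
// its cost model would otherwise decline to.
void add_no_nulls(const int64_t* a, const int64_t* b, int64_t* out, int64_t n) {
#if defined(__clang__)
#pragma clang loop vectorize(enable)
#elif defined(__GNUC__)
#pragma GCC ivdep  // promise no loop-carried dependence, enabling vectorization
#endif
  for (int64_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

`#pragma omp simd` is another common spelling of the same request, but it requires building with OpenMP SIMD support enabled.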
I'd just like to chime in with the use case of in-situ data analysis for
simulations. This domain tends to be cautious with dependencies and
there is a lot of C and Fortran, but the in-situ analysis tools will
preferably reside in separate processes while sharing data via shared
memory
Wes McKinney writes:
> The abstract/all-virtual base has some benefits:
>
> * No need to implement "forwarding" methods to the private implementation
> * Do not have to declare "friend" classes in the header for some cases
> where other classes need to access the methods of a private
>
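The benefit Wes describes can be shown with a minimal contrast (illustrative classes, not Arrow's actual API): with an abstract/all-virtual base, callers program against the interface and the concrete class needs no forwarding methods or `friend` declarations.

```cpp
#include <memory>

// Abstract/all-virtual base: the public header only declares the interface.
class Reader {
 public:
  virtual ~Reader() = default;
  virtual int ReadByte() = 0;
};

// Concrete implementation can live entirely in a .cc file; its methods
// are reachable through the base without forwarding wrappers.
class FixedReader : public Reader {
 public:
  explicit FixedReader(int v) : v_(v) {}
  int ReadByte() override { return v_; }

 private:
  int v_;
};

std::unique_ptr<Reader> MakeFixedReader(int v) {
  return std::make_unique<FixedReader>(v);
}
```

The usual trade-off versus a pimpl is an extra virtual dispatch per call and the loss of non-virtual inline fast paths.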
Sutou Kouhei writes:
> How about creating a mirror repository on
> https://gitlab.com/ only to run CI jobs?
>
> This is an idea that is described in
> https://issues.apache.org/jira/browse/ARROW-5673 .
>
> GitLab CI can attach external workers. So we can increase CI
> capacity by adding our new
"Malakhov, Anton" writes:
> Jed,
>
>> From: Jed Brown [mailto:j...@jedbrown.org]
>> Sent: Friday, May 3, 2019 12:41
>
>> You linked to a NumPy discussion
>> (https://github.com/numpy/numpy/issues/11826) that is encountering the same
>> is
"Malakhov, Anton" writes:
>> > the library creates threads internally. It's a disaster for managing
>> > oversubscription and affinity issues among groups of threads and/or
>> > multiple processes (e.g., MPI).
>
> This is exactly what I'm referring to when I talk about issues with threading
>
Antoine Pitrou writes:
> Hi Jed,
>
> On 03/05/2019 at 05:47, Jed Brown wrote:
>> I would caution to please not commit to the MKL/BLAS model in which the
>> library creates threads internally. It's a disaster for managing
>> oversubscription and affinity issues
I would caution to please not commit to the MKL/BLAS model in which the
library creates threads internally. It's a disaster for managing
oversubscription and affinity issues among groups of threads and/or
multiple processes (e.g., MPI). For example, a composable OpenMP
technique is for the
Kenta Murata writes:
> Hi Jed,
>
> I'd like to describe the current status of the implementation of SparseTensor.
> I hope the following explanation will help you.
>
> First of all, I designed the current SparseTensor format as the first
> interim implementation.
> At this time I used
relation to Arrow I don't understand (could be an explicit
non-goal for all I know).
Wes McKinney writes:
> hi Jed,
>
> Would you like to submit a pull request to propose the changes or
> additions you are describing?
>
> Thanks
> Wes
>
> On Sat, Mar 9, 2019 at 11:32 PM
Wes asked me to bring this discussion here. I'm a developer of PETSc
and, with Arrow getting into the sparse representation space, I would
like for it to interoperate as well as possible.
1. Please guarantee support for 64-bit offsets and indices. The current
spec uses "long", which is 32-bit
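The 64-bit concern can be made concrete with a sketch (a hypothetical CSR layout, not the Arrow spec): fixed-width `int64_t` offsets and indices avoid `long`, which is only 32 bits on LLP64 platforms such as 64-bit Windows.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CSR (compressed sparse row) matrix with 64-bit offsets and
// indices, so buffers with more than 2^31 entries remain addressable.
struct CsrMatrix {
  int64_t rows = 0;
  int64_t cols = 0;
  std::vector<int64_t> row_offsets;  // length rows + 1
  std::vector<int64_t> col_indices;  // length nnz
  std::vector<double> values;        // length nnz

  int64_t nnz() const { return static_cast<int64_t>(values.size()); }
};
```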