Dear Apache Arrow Dev Community,

My name is Bechir, and I am currently working on a project that involves
implementing graph algorithms in Apache Arrow.

The initial plan was to construct a node structure and a subsequent graph
that would encompass all the nodes. However, I quickly realized that due to
Apache Arrow's columnar format, this approach was not feasible.

I tried a couple of things, including an implementation of the
shortest-path algorithm. However, I quickly discovered that manipulating
Arrow objects, particularly when applying graph algorithms, was more
complex than anticipated, and it became clear that I would need to
resort to data structures outside of what Arrow offers (e.g., a priority
queue such as Python's heapq cannot be expressed with Arrow arrays alone).
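
To make the problem concrete, here is a minimal sketch of what I mean, not
my actual code. It assumes the graph is stored as an edge list in three
parallel columns (src, dst, weight, all names hypothetical) with
non-negative weights. Plain Python lists stand in for the Arrow columns
here; with Arrow they could be obtained with to_pylist(). The priority
queue and the adjacency map both live outside the columnar data:

```python
import heapq
from collections import defaultdict


def shortest_paths(src, dst, weight, start):
    """Dijkstra over an edge list given as three parallel columns.

    Assumes non-negative weights. The columns are plain sequences here;
    they stand in for Arrow columns converted to Python lists.
    """
    # Row-oriented adjacency map built from the columnar edge list.
    adjacency = defaultdict(list)
    for u, v, w in zip(src, dst, weight):
        adjacency[u].append((v, w))

    dist = {start: 0}
    queue = [(0, start)]  # heapq keeps the smallest distance first
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in adjacency[u]:
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(queue, (candidate, v))
    return dist


# Hypothetical four-node graph, for illustration only.
src = [0, 0, 1, 2]
dst = [1, 2, 2, 3]
weight = [4, 1, 1, 5]
distances = shortest_paths(src, dst, weight, start=0)
```

The point is that only the input columns are columnar; everything the
algorithm mutates is an ordinary Python structure.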

I also tried an approach modeled on an SQL-based method (see:
https://ibb.co/0rPGB42 ), but ran into roadblocks there too and
ended up resorting to Pandas for some transformations.

My next course of action is to experiment with the compressed sparse row
(CSR) representation, hoping to express the algorithms as matrix
multiplication. Honestly, with what I know right now, I remain skeptical
about its feasibility. Before committing to this approach, I would greatly
appreciate your opinion based on your experience with Apache Arrow.
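
For reference, this is roughly the layout I have in mind, as a sketch
under my own assumptions rather than anything Arrow-specific. The same
hypothetical edge-list columns (src, dst, weight) are packed into three
flat arrays (indptr, indices, data), which look like a natural fit for
columnar storage. The multiplication shown is the simpler sparse
matrix times dense vector case, not a full matrix-matrix product:

```python
def edges_to_csr(src, dst, weight, num_nodes):
    """Pack an edge list into CSR arrays (indptr, indices, data)."""
    # Count outgoing edges per source node.
    counts = [0] * num_nodes
    for u in src:
        counts[u] += 1

    # indptr[i]..indptr[i + 1] delimits the edges of row i.
    indptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        indptr[i + 1] = indptr[i] + counts[i]

    indices = [0] * len(src)
    data = [0] * len(src)
    next_slot = indptr[:-1]  # slicing copies, so indptr is not modified
    for u, v, w in zip(src, dst, weight):
        pos = next_slot[u]
        indices[pos] = v
        data[pos] = w
        next_slot[u] += 1
    return indptr, indices, data


def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x."""
    y = [0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


# Hypothetical four-node graph, for illustration only.
src = [0, 0, 1, 2]
dst = [1, 2, 2, 3]
weight = [4, 1, 1, 5]
indptr, indices, data = edges_to_csr(src, dst, weight, num_nodes=4)
row_sums = csr_matvec(indptr, indices, data, [1, 1, 1, 1])
```

Multiplying by a vector of ones simply sums the outgoing edge weights of
each node, which is an easy way to check the layout.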

Thank you very much for your time.

Looking forward to potentially discussing this further.

Many thanks,
Bechir
