stenlarsson commented on issue #48880:
URL: https://github.com/apache/arrow/issues/48880#issuecomment-3776529312

   From what I can gather, the problem is that an `Arrow::Buffer` points to 
data owned by a Ruby string, and the garbage collector destroys the string 
while the buffer still refers to it. A long chain of references must be 
maintained for the string to survive. In the example below it looks 
something like this:
   
   `ExecutePlan` => `ExecuteNode` => `ProjectNodeOptions` => `CallExpression` 
=> `LiteralExpression` => `ScalarDatum` => `StringScalar` => `Buffer` => string
   
   I tried adding all of these classes 
[here](https://github.com/apache/arrow/blob/e78abb9cc3bf07f077b05d1acd97f95045e6d246/ruby/red-arrow/lib/arrow/loader.rb#L45-L64)
 to keep the references, but `ExecutePlan` is not so easy. Have a look at the 
following example:
   
   ```ruby
   require 'bundler/setup'
   require 'arrow'
   
   table = Arrow::Table.new(
     'foo' => [1, 2],
     'bar' => [%w[a b], %w[c d]],
   )
   plan = Arrow::ExecutePlan.new
   node = plan.build_source_node(table)
   node = plan.build_project_node(
     node,
     Arrow::ProjectNodeOptions.new(
       [
         :foo,
         # The ',' literal is presumably the Ruby string at the end of the chain.
         Arrow::CallExpression.new('binary_join', [:bar, ',']),
       ],
       %w[foo bar],
     ),
   )
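   # Print the object IDs of the node wrappers twice to show they are stable.
   puts plan.nodes.map(&:object_id).join(', ')
   puts plan.nodes.map(&:object_id).join(', ')
   # Force a garbage collection, then print the IDs twice again.
   GC.start
   puts plan.nodes.map(&:object_id).join(', ')
   puts plan.nodes.map(&:object_id).join(', ')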
   ```
   
   When I run this example I get the following output:
   
   ```
   6128, 6136
   6128, 6136
   6128, 6144
   6128, 6144
   ```
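   
   The same before/after comparison can be packaged as a small helper. This 
is only a sketch in plain Ruby: `changed_wrapper_indexes` is a hypothetical 
name, not part of red-arrow, and it assumes the block returns the objects in 
the same order on every call.
   
   ```ruby
   # Hypothetical helper: return the positions whose Ruby object differs
   # before and after a GC run. The block must return an array of objects.
   def changed_wrapper_indexes
     before = yield.map(&:object_id)
     GC.start
     after = yield.map(&:object_id)
     before.each_index.reject { |i| before[i] == after[i] }
   end
   
   stable = [Object.new, Object.new]
   # The same two objects are returned on every call.
   p changed_wrapper_indexes { stable }
   # The second slot is rebuilt on every call.
   p changed_wrapper_indexes { [stable[0], Object.new] }
   ```
   
   With the plan from the example, `changed_wrapper_indexes { plan.nodes }` 
would report which node positions got a new Ruby object.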
   
   The second ID changes from 6136 to 6144, so it looks like the project node 
is represented by a new Ruby object after the garbage collector runs, while 
the source node keeps its object. @kou, what could be the cause of this?
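
   Why a replaced Ruby object would matter here: if the references are kept 
on the Ruby object itself (an assumption on my part, not something confirmed 
in this thread), they are lost when that object is replaced. A minimal 
plain-Ruby model of this, where `NativeNode`, `Wrapper` and `wrap` are 
hypothetical stand-ins and the collection is simulated by clearing the cache 
by hand rather than by running the GC:
   
   ```ruby
   # Hypothetical stand-in for a C-level object that outlives its Ruby wrapper.
   class NativeNode
     attr_accessor :wrapper
   end
   
   # Hypothetical stand-in for the Ruby wrapper created on demand.
   class Wrapper
     def initialize(native)
       @native = native
       @kept = []
     end
   
     # Remember an object so that it stays reachable through this wrapper.
     def keep(object)
       @kept << object
     end
   
     def kept
       @kept
     end
   end
   
   # Return the cached wrapper, or build a fresh one if none is cached.
   def wrap(native)
     native.wrapper ||= Wrapper.new(native)
   end
   
   native = NativeNode.new
   first = wrap(native)
   first.keep('string backing a buffer')
   
   # Simulate the wrapper being discarded while the native object lives on.
   native.wrapper = nil
   
   second = wrap(native)
   puts first.equal?(second)  # is it the same wrapper?
   puts second.kept.empty?    # did the new wrapper lose the kept reference?
   ```
   
   This only models the bookkeeping. It says nothing about how the actual 
bindings cache or recreate their Ruby objects.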


