dangotbanned commented on issue #47288:
URL: https://github.com/apache/arrow/issues/47288#issuecomment-3597653670

   We'd have a use for this in https://github.com/narwhals-dev/narwhals
   
   
   [`ArrowSeries.sample`](https://github.com/narwhals-dev/narwhals/blob/0309e3f6a927a764efe89faee6b1e6722b1021a0/narwhals/_arrow/series.py#L644-L661) currently depends on `numpy` to provide the same functionality found in [`polars.Series.sample`](https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.sample.html).
   
   
   [`pandas.Series.sample`](https://pandas.pydata.org/docs/reference/api/pandas.Series.sample.html) has a pretty similar API too.
   
   ### Example
   ```py
   from __future__ import annotations
   
   from typing import Any
   import pyarrow as pa
   
   
   def sample(
       arr: pa.ChunkedArray[Any],
       n: int | None,
       *,
       fraction: float | None,
       with_replacement: bool,
       seed: int | None,
   ) -> pa.ChunkedArray[Any]:
       import numpy as np
   
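       num_rows = len(arr)
       if n is None:
           # Mirror polars: default to a single row when `fraction` is also unset.
           n = 1 if fraction is None else int(num_rows * fraction)
       rng = np.random.default_rng(seed=seed)
       idx = np.arange(num_rows)
       # `choice` returns row positions (not a boolean mask), so name them accordingly.
       indices = rng.choice(idx, size=n, replace=with_replacement)
       return arr.take(indices)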
   
   
   ca = pa.chunked_array([[2, 4, 8, 0]])
   result = sample(ca, n=10, with_replacement=True, fraction=None, seed=None)
   print(result.to_pylist())
   ```
   
   ```
   [8, 4, 4, 8, 0, 0, 8, 4, 4, 8]
   ```
   
   Tbh we should probably be using [`pyarrow.arange`](https://arrow.apache.org/docs/dev/python/generated/pyarrow.arange.html) now anyway 😄
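
For what it's worth, the index-picking step on its own doesn't strictly need `numpy`. Here's a rough sketch using only the standard library's `random` module. `sample_indices` is a hypothetical helper name (not part of narwhals or pyarrow), and the single-row default when both `n` and `fraction` are unset is an assumption borrowed from the polars docs.

```python
from __future__ import annotations

import random


def sample_indices(
    num_rows: int,
    n: int | None = None,
    *,
    fraction: float | None = None,
    with_replacement: bool = False,
    seed: int | None = None,
) -> list[int]:
    """Hypothetical helper: pick row positions using only the standard library."""
    if n is None:
        # Assumption: fall back to one row when neither `n` nor `fraction` is given.
        n = 1 if fraction is None else int(num_rows * fraction)
    rng = random.Random(seed)
    if with_replacement:
        # The same position may be drawn more than once.
        return rng.choices(range(num_rows), k=n)
    # Without replacement; raises ValueError if n > num_rows.
    return rng.sample(range(num_rows), k=n)


indices = sample_indices(4, 10, with_replacement=True, seed=0)
print(len(indices))
```

The resulting list could then be wrapped with `pa.array(indices)` and handed to `ChunkedArray.take`, same as the `numpy` version above. It wouldn't be as fast for large arrays, which is part of why a native kernel would be nice.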

