[Numpy-discussion] Re: Converting array to string

2024-02-25 Thread Robert Kern
On Sat, Feb 24, 2024 at 7:17 PM Dom Grigonis  wrote:

> Hello,
>
> I am seeking a bit of help.
>
> I need a fast way to transfer numpy arrays in json format.
>
> Thus, converting array to list is not an option and I need something
> similar to:
>
> a = np.ones(1000)
> %timeit a.tobytes()
> 17.6 ms
>
> This is quick, fast and portable. In other words I am very happy with
> this, but...
>
> JSON does not support bytes.
>
> Any subsequent conversion from bytes to string is several times slower
> than the serialisation itself.
>
> So my question is: Is there any way to serialise directly to string?
>
> I remember there used to be 2 methods: tobytes and tostring. However, I
> see that tostring is deprecated and its functionality is equivalent to
> `tobytes`. Is it completely gone? Or is there a chance I can still find a
> performant version of converting to and back from `str` type of non-human
> readable form?
>

The old `tostring` was actually the same as `tobytes`. In Python 2, the
`str` type was what `bytes` is now, a string of octets. In Python 3, `str`
became a string of Unicode characters (what you want) and the `bytes` type
was introduced for the string of octets, so `tostring` was merely _renamed_
to `tobytes` to match. `tostring` never returned a string of Unicode
characters suitable for inclusion in JSON.
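
To make the point above concrete, here is a minimal sketch (not from the
original message) showing that `tobytes` returns a `bytes` object and that
the standard-library `json` module refuses to serialise it. The variable
names are illustrative only.

```python
import json

import numpy as np

a = np.ones(4)            # four float64 values, 8 bytes each
raw = a.tobytes()         # raw octets, i.e. `bytes`, not `str`

# json.dumps only understands str, numbers, lists, dicts, bool and None,
# so passing bytes is expected to raise TypeError.
try:
    json.dumps({"data": raw})
    json_accepts_bytes = True
except TypeError:
    json_accepts_bytes = False

print(type(raw).__name__, len(raw), json_accepts_bytes)
```

Any JSON transport therefore needs an extra bytes-to-text step, which is
what the rest of the reply is about.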

AFAICT, there is not likely to be a much more efficient way to convert from
an array to a reasonable JSONable encoding (e.g. base64). The time you are
seeing is the time it takes to encode that amount of data, period. That
said, if you want to use a quite inefficient hex encoding, `a.data.hex()`
is somewhat faster than the base64 encoding, but it's less than ideal in
terms of space usage.
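
The two encodings mentioned above can be sketched as follows. This is an
illustration under assumptions, not something posted in the thread: the
helper names `encode_array` and `decode_array` and the choice to store
`dtype` and `shape` next to the payload are hypothetical.

```python
import base64
import json

import numpy as np


def encode_array(arr):
    """Hypothetical helper: pack an array into a JSON-friendly dict."""
    arr = np.ascontiguousarray(arr)
    return {
        "dtype": arr.dtype.str,        # e.g. '<f8', records byte order
        "shape": list(arr.shape),
        "data": base64.b64encode(arr.tobytes()).decode("ascii"),
    }


def decode_array(obj):
    """Hypothetical helper: inverse of encode_array."""
    raw = base64.b64decode(obj["data"])
    flat = np.frombuffer(raw, dtype=np.dtype(obj["dtype"]))
    return flat.reshape(obj["shape"])  # read-only view on the decoded bytes


a = np.arange(6, dtype=np.float64).reshape(2, 3)
text = json.dumps(encode_array(a))     # plain str, valid JSON
b = decode_array(json.loads(text))

# Hex variant from the message: two characters per byte.
flat = np.arange(6, dtype=np.float64)
hex_text = flat.data.hex()
c = np.frombuffer(bytes.fromhex(hex_text), dtype=flat.dtype)

print(np.array_equal(a, b), np.array_equal(flat, c))
print(len(hex_text), "hex chars for", flat.nbytes, "bytes")
```

Hex doubles the payload size, while base64 grows it by roughly one third,
which is the space trade-off referred to above.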

-- 
Robert Kern
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Converting array to string

2024-02-25 Thread Dom Grigonis
Thank you for your answer,

Yeah, I was unsure if it ever existed in the first place.

Space is less of an issue and something like `a.data.hex()` would be fine as 
long as its speed was on par with `a.tobytes()`. However, it is 10x slower on 
my machine.
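
A comparison like the one described above could be reproduced with a sketch
along these lines. It is an assumption about how one might measure it, not
the script used in the thread, and the absolute numbers will differ between
machines.

```python
import base64
import timeit

import numpy as np

a = np.ones(1000)

# Candidate serialisations discussed in the thread.
candidates = {
    "tobytes": lambda: a.tobytes(),
    "hex": lambda: a.data.hex(),
    "base64": lambda: base64.b64encode(a.tobytes()).decode("ascii"),
}

calls = 1000
timings = {}
for name, fn in candidates.items():
    # Best of 5 runs, converted to seconds per call.
    timings[name] = min(timeit.repeat(fn, number=calls, repeat=5)) / calls

for name, seconds in timings.items():
    print(f"{name:8s} {seconds * 1e6:8.2f} us per call")
```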

This e-mail is pretty much a final check (after a fair bit of research and
attempts) that it cannot be done, so that I can rule out this possibility
and concentrate on other options.

Regards,
DG



[Numpy-discussion] Re: Converting array to string

2024-02-25 Thread Robert Kern
On Sun, Feb 25, 2024 at 1:52 PM Dom Grigonis  wrote:

> Thank you for your answer,
>
> Yeah, I was unsure if it ever existed in the first place.
>
> Space is less of an issue and something like `a.data.hex()` would be fine
> as long as its speed was on par with `a.tobytes()`. However, it is 10x
> slower on my machine.
>
> This e-mail is pretty much a final check (after a fair bit of research and
> attempts) that it cannot be done, so that I can rule out this possibility
> and concentrate on other options.
>

I think that mostly runs the gamut for pure JSON. Consider looking at
BJData, but it's "JSON-based" and not quite pure JSON.

https://neurojson.org/

-- 
Robert Kern


[Numpy-discussion] Re: Converting array to string

2024-02-25 Thread Dom Grigonis
Does it have native bytes support? To me, it's either fast conversion to
`string` or a data format with native bytes support.

Sometimes readability is important, sometimes speed takes priority. Even with a
good, unified data structure for arrays, indexed arrays, etc., it is always
good to know that if it becomes a bottleneck I can substitute an array value
with its bytes representation with little to no extra work.

DG



[Numpy-discussion] Re: next Documentation team meeting

2024-02-25 Thread Mukulika Pahari
Hi all,

Our next Documentation Team meeting will happen on *Monday, February 26* at 
*11PM UTC*. If this time slot is inconvenient for you to join, please let me 
know in the replies or Slack and we will work something out.

All are welcome - you don't need to already be a contributor to join. If you 
have questions or are curious about what we're doing, we'll be happy to meet 
you!

If you wish to join on Zoom, use this (updated) link:
https://numfocus-org.zoom.us/j/85016474448?pwd=TWEvaWJ1SklyVEpwNXUrcHV1YmFJQ...

Here's the permanent hackmd document with the meeting notes (still being
updated):
https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

Best wishes,
Mukulika


[Numpy-discussion] next NumPy community meeting - February 28th, 2024 at 6pm UTC

2024-02-25 Thread Inessa Pawson
The next NumPy community meeting will be held this Wednesday, February 28th
at 6pm UTC.
Join us via Zoom:
https://numfocus-org.zoom.us/j/83278611437?pwd=ekhoLzlHRjdWc0NOY2FQM0NPemdkZz09
.
Everyone is welcome and encouraged to attend.
To add topics you'd like to discuss to the meeting agenda, follow the
link: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both.
For the notes from the previous meetings, visit:
https://github.com/numpy/archive/tree/main/community_meetings.

-- 
Cheers,
Inessa

Inessa Pawson
GitHub: inessapawson