[capnproto] Text blob too big.

2022-10-31 Thread Josemi
Hello.

I need to work with structured data whose attributes have undefined 
length; some of them could be several GB. 

I had been using Protocol Buffers for this until I found that there is a hard 
limit of 2GB per message. Then I saw this Stack Overflow solution. 
It says that Cap'n Proto "can support messages up to 2^64 bytes (2^32 
segments of 4GB each)".

I rewrote the code and now it raises the error *capnp.lib.capnp.KjException: 
capnp/layout.c++:1694: failed: text blob too big* when trying to assign a 
0.8 GB buffer to a Data field.

In the Cap'n Proto schema the attribute looks like this:
*file @1 :Data; # ptr[1],*

And the code, in Python, is something like:
with open('data_file', 'rb') as f:
    file = f.read()  # reads the whole file into memory as bytes


What's wrong with that? 
Won't Cap'n Proto solve my problem?

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to capnproto+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/8a23280b-b43b-483a-94db-7fd94bba93een%40googlegroups.com.


Re: [capnproto] Text blob too big.

2022-10-31 Thread 'Kenton Varda' via Cap'n Proto
For a single Text or Data blob there is a hard limit of 512MB. You can,
however, construct a message which contains multiple blobs, e.g. use
`List(Text)`. Such a message can be up to 2^64 bytes.
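
A minimal sketch of the splitting step behind the multi-blob approach, using only the Python standard library. The names `iter_chunks` and `CHUNK_SIZE` are illustrative, and the 256 MiB chunk size is an arbitrary assumption chosen to stay under the 512MB per-blob limit. Each yielded piece could then be stored as one element of a list-of-blobs field; the Cap'n Proto builder calls themselves are not shown here.

```python
import io
from typing import BinaryIO, Iterator

# Assumption: any size safely below the 512MB per-blob limit would do.
CHUNK_SIZE = 256 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive pieces of `stream`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk


# Demo on a tiny in-memory stream with a 4-byte chunk size.
demo_chunks = list(iter_chunks(io.BytesIO(b"0123456789"), chunk_size=4))
```

With a real file, `open('data_file', 'rb')` would take the place of the in-memory stream, so only one chunk at a time needs to be held in memory while the list is filled.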

If I were redesigning the encoding from scratch I'd probably allow for
bigger individual blobs but there's no way to introduce them now without
breaking compatibility, unfortunately.

Protobuf theoretically supports 2GB messages but because the messages have
to be parsed upfront in O(n) time, you won't have good results with
messages of that size. Cap'n Proto, on the other hand, quite comfortably
supports multi-GB messages since you can mmap() the message and randomly
access it in O(1) time.
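
To illustrate the mmap() point in general terms, here is a small standard-library sketch that reads a slice from the middle of a file without reading what comes before it. It shows plain memory-mapped access only, not the Cap'n Proto reader API; `read_slice` and the demo file layout are hypothetical.

```python
import mmap
import os
import tempfile


def read_slice(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes at `offset` without reading the rest of the file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS loads pages lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Demo: a small file with padding between a header and a payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header" + b"\x00" * 4096 + b"payload")
    demo_path = tmp.name

demo_slice = read_slice(demo_path, 6 + 4096, 7)
os.remove(demo_path)
```

The cost of the lookup depends on the size of the slice, not on the size of the file, which is what makes multi-GB files practical to access this way.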

-Kenton


Re: [capnproto] Text blob too big.

2022-10-31 Thread Josemi
Hi Kenton, thanks for the quick response. I am now working on this 
solution; the per-blob limit is not much of a problem for me. 
Thanks to Cap'n Proto I think I'll have an elegant solution. Thanks.


Re: [capnproto] Text blob too big.

2022-10-31 Thread Jens Alfke


> On Oct 31, 2022, at 12:21 PM, Josemi  wrote:
> 
> Hello.
> 
> I need to work with structured data whose attributes have undefined 
> length; some of them could be several GB. 

Most structured storage is optimized for smaller data. And huge values in-line 
push all the records far apart, which is bad for cache performance.

SQLite supports large blobs, up to 2^31-1 bytes each (the default limit is 
1,000,000,000 bytes). (Even with that it’s best to put the blobs in a 
separate table and join your records to it.)
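
A small sketch of the separate-table layout, using Python's built-in `sqlite3` module with an in-memory database. The table and column names (`records`, `blobs`, `record_id`, `content`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE blobs (
        record_id INTEGER NOT NULL REFERENCES records(id),
        content   BLOB NOT NULL
    );
""")

# Insert the small record first, then attach its payload in the other table.
cur = conn.execute("INSERT INTO records (name) VALUES (?)", ("data_file",))
record_id = cur.lastrowid
conn.execute(
    "INSERT INTO blobs (record_id, content) VALUES (?, ?)",
    (record_id, b"\x00\x01\x02"),
)
conn.commit()

# Listing records reads only the small table.
names = [row[0] for row in conn.execute("SELECT name FROM records")]

# Join only when the payload is actually needed.
row = conn.execute(
    "SELECT r.name, b.content FROM records AS r "
    "JOIN blobs AS b ON b.record_id = r.id WHERE r.id = ?",
    (record_id,),
).fetchone()
```

Keeping the payload out of the `records` table means the small rows stay close together, which is the cache-friendliness argument made above.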

—Jens


Re: [capnproto] Text blob too big.

2022-11-02 Thread Josemi
[Translated from Spanish]
The problem is that the server receives all the data through a transmission 
buffer, piece by piece, without knowing the structured object. So I am 
storing it in a file, because the server does not need to open it (it only 
needs to read small pieces of data). If I wanted to put all the attributes in 
SQL tables, I would have to take all the fragments and build the structured 
object in memory, and that, as I understand it, is a problem.
I don't know whether my ideas are correct or whether I am missing something.

-- Josemi.


Re: [capnproto] Text blob too big.

2022-11-03 Thread Jose mi
Sorry, I wrote it in Spanish and only noticed now.

Translation:

The problem is that the server receives all the data in a transmission 
buffer, piece by piece, without knowing the structured object. So I'm 
storing it in a file because the server doesn't need to open it (it only 
needs to read small data). If I want to put all the attributes in the SQL 
tables, I will have to take all the fragments and build the structured 
object in memory, and this, as I understand it, is a problem.
I don't know if my ideas are correct or if I am missing something.


Re: [capnproto] Text blob too big.

2023-03-17 Thread Jonathan Shapiro
Amusingly, I was just looking at this for a protocol.  Kenton's suggestion 
to use List(Data) works, but it carries an unintended buffering problem. A 
message can have multiple segments, and a correct Cap'n Proto implementation 
will extend its segment pool as needed to hold a large byte sequence, but 
if I read the encoding spec correctly, it cannot *send* any of the segments 
until *all* of the segments are available for framing. Which means that 
your big blob of data sits in client memory while you are loading it, and 
stays there until the message is fully transmitted. Segment release could 
be optimized by releasing segments as transmission proceeds, but that isn't 
required by the Cap'n Proto specification, and it doesn't resolve the "load 
big blob into memory" problem.

One alternative is to introduce a builder pattern, something like this:

interface FileBuilder {
  write @0 (d :Data) -> (written :Int32);  # Returns number of bytes written
  close @1 () -> (file :File);
}

interface MyService {
  createFile @0 () -> (builder :FileBuilder);
  myInterestingCall @1 (... file :File, ...) -> (val :InterestingResult);
}

The advantage to this, mainly, is that the byte transmission is divided 
into a *sequence* of messages. On the service side, these can be stashed 
aside until the close() call is made, at which point the file object is 
fabricated on the service and a File capability is returned to the client 
that can be included in other messages. Because of promise pipelining, this 
doesn't take as many round trips as you might think.
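
The shape of that builder can be sketched in plain Python, with no RPC involved. This is only a local stand-in: the class below mirrors the write/close methods of the interface above, spools each chunk to a temporary file, and returns a file path where the real service would return a File capability. In an actual Cap'n Proto service each write() would be a separate RPC message.

```python
import os
import tempfile


class FileBuilder:
    """Local stand-in for the FileBuilder interface: accepts data piece by piece."""

    def __init__(self) -> None:
        # Chunks are spooled to disk so only one chunk is in memory at a time.
        fd, self._path = tempfile.mkstemp()
        self._out = os.fdopen(fd, "wb")

    def write(self, d: bytes) -> int:
        """Append one chunk and return the number of bytes written."""
        return self._out.write(d)

    def close(self) -> str:
        """Finish the upload and return a handle (here just the file path)."""
        self._out.close()
        return self._path


# Demo: three small "messages" followed by close().
builder = FileBuilder()
written = [builder.write(chunk) for chunk in (b"abc", b"defg", b"h")]
path = builder.close()

with open(path, "rb") as f:
    stored = f.read()
os.remove(path)
```

Neither side ever holds more than one chunk in memory, which is the property the single large message cannot offer.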

The main puzzle here, in my mind, is that the server needs to know when 
client-side capabilities have been dropped. I don't remember seeing anything 
like a capability release protocol that advises the serving entity when the 
state associated with an ephemeral object can be released.


Re: [capnproto] Text blob too big.

2023-03-17 Thread Jonathan Shapiro
I obviously need to finish reading specs before I ask stupid questions. 
rpc.capnp documents the release message quite clearly.
