[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-27 Thread sschepens
Github user sschepens commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
Isn't Bookkeeper max entry size 5mb?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-27 Thread sschepens
Github user sschepens commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
Isn't Bookkeeper max entry size 5mb?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> The configurable max message size could be a separate from this

I will create a separate issue to provide a way to configure this size and 
will create a separate PR for both of them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> The configurable max message size could be a separate from this

I will create a separate issue to provide a way to configure this size and 
will create a separate PR for both of them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> If there is a system(Pulsar) limit, and if the system can honor a 
"violating" request by optimizing the request, it should. 

Ok, I agree with that. The configurable max message size could be a 
separate from this, though something that on its own we should do at some point.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> If there is a system(Pulsar) limit, and if the system can honor a 
"violating" request by optimizing the request, it should. 

Ok, I agree with that. The configurable max message size could be a 
separate from this, though something that on its own we should do at some point.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread joefk
Github user joefk commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
If there needs to be a hard limit (for 100% assurance), the application can 
enforce that itself without any changes in the system (Pulsar).

My perspective is from what I think as the origin of this ask:   If there 
is a system(Pulsar)  limit, and if the system can honor a "violating" request 
by optimizing the request, it should.  Agreed that sometimes the system will 
not always succeed in this, but it should try.  That could reduce user impact 
by reducing the number of violations, and some use-cases might prefer that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread joefk
Github user joefk commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
If there needs to be a hard limit (for 100% assurance), the application can 
enforce that itself without any changes in the system (Pulsar).

My perspective is from what I think as the origin of this ask:   If there 
is a system(Pulsar)  limit, and if the system can honor a "violating" request 
by optimizing the request, it should.  Agreed that sometimes the system will 
not always succeed in this, but it should try.  That could reduce user impact 
by reducing the number of violations, and some use-cases might prefer that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
I am not saying we cant, just trying to scope all the implications :)

If you check the size after compression you're not sure whether a 
particular message will go through or not.

Example:
I have a 6MB JSON document. Most probably it will compress to 600KB, but I 
cannot tell 100%. If the JSON is weird enough it might be 5.5MB. 

So if I send this 6MB message it might get rejected by the client.

Typically the safe-size buffer for compression is _bigger_ that the 
uncompressed content, to account for incompressible content plus the compressor 
overhead.

In my view, compressing is an optimization to safe network/IO/disk space, 
but the max size should be enforced irrespectively of that. So if you have 6MB 
or 10MB entries, you should be able to configure the system to handle that, 
with 100% assurance that it will processed correctly, instead of relying on the 
assumption that compression will solve it, which might be true in 99.99% of the 
cases but fail at unexpected times.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
I am not saying we cant, just trying to scope all the implications :)

If you check the size after compression you're not sure whether a 
particular message will go through or not.

Example:
I have a 6MB JSON document. Most probably it will compress to 600KB, but I 
cannot tell 100%. If the JSON is weird enough it might be 5.5MB. 

So if I send this 6MB message it might get rejected by the client.

Typically the safe-size buffer for compression is _bigger_ that the 
uncompressed content, to account for incompressible content plus the compressor 
overhead.

In my view, compressing is an optimization to safe network/IO/disk space, 
but the max size should be enforced irrespectively of that. So if you have 6MB 
or 10MB entries, you should be able to configure the system to handle that, 
with 100% assurance that it will processed correctly, instead of relying on the 
assumption that compression will solve it, which might be true in 99.99% of the 
cases but fail at unexpected times.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread joefk
Github user joefk commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
@merlimat Why can't we check size post-compression? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread joefk
Github user joefk commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
@merlimat Why can't we check size post-compression? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread Matteo Merli
> I am curious about SLA vs. message size. If Pulsar has a 5ms SLA is that
going to be feasible if there are very large messages?

Of course it's not feasible :)

The numbers published were related to 1KB entries. Of course there can be a
huge number of different use cases with different trades off. Typically
most users don't care about latency of bigger messages, so we never really
tested it, but at least at lower transfer rate, the latency shouldn't be
too bad even for 1MB messages.



--
Matteo Merli


On Mon, Jun 26, 2017 at 1:42 PM, Sahaya Andrews 
wrote:

> As the message size grows, we won't be able to guarantee 5ms latency
> since the underlying storage itself will take longer to sync the data.
>
> Andrews.
>
> On Mon, Jun 26, 2017 at 1:31 PM, Dave Fisher 
> wrote:
> > Hi -
> >
> > If you are negotiating the buffer size then you should also negotiate if
> this is a compressed or uncompressed size.
> >
> > I am curious about SLA vs. message size. If Pulsar has a 5ms SLA is that
> going to be feasible if there are very large messages?
> >
> > Regards,
> > Dave
> >
> >> On Jun 26, 2017, at 1:26 PM, merlimat  wrote:
> >>
> >> Github user merlimat commented on the issue:
> >>
> >>https://github.com/apache/incubator-pulsar/issues/523
> >>
> >>> If it's configurable then client may start sending msg with
> unreasonable size which will be rejected at broker due to
> frame-size/storage-limitation, So, client will not know up-front.
> >>
> >>Max-frame size could be negotiated between client and broker in
> Connect/Connected handshake
> >>
> >>
> >> ---
> >> If your project is set up for it, you can reply to this email and have
> your
> >> reply appear on GitHub as well. If your project does not have this
> feature
> >> enabled and wishes so, or if the feature is enabled but not working,
> please
> >> contact infrastructure at infrastruct...@apache.org or file a JIRA
> ticket
> >> with INFRA.
> >> ---
> >
>


Re: [GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread Sahaya Andrews
As the message size grows, we won't be able to guarantee 5ms latency
since the underlying storage itself will take longer to sync the data.

Andrews.

On Mon, Jun 26, 2017 at 1:31 PM, Dave Fisher  wrote:
> Hi -
>
> If you are negotiating the buffer size then you should also negotiate if this 
> is a compressed or uncompressed size.
>
> I am curious about SLA vs. message size. If Pulsar has a 5ms SLA is that 
> going to be feasible if there are very large messages?
>
> Regards,
> Dave
>
>> On Jun 26, 2017, at 1:26 PM, merlimat  wrote:
>>
>> Github user merlimat commented on the issue:
>>
>>https://github.com/apache/incubator-pulsar/issues/523
>>
>>> If it's configurable then client may start sending msg with unreasonable 
>>> size which will be rejected at broker due to frame-size/storage-limitation, 
>>> So, client will not know up-front.
>>
>>Max-frame size could be negotiated between client and broker in 
>> Connect/Connected handshake
>>
>>
>> ---
>> If your project is set up for it, you can reply to this email and have your
>> reply appear on GitHub as well. If your project does not have this feature
>> enabled and wishes so, or if the feature is enabled but not working, please
>> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
>> with INFRA.
>> ---
>


Re: [GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread Dave Fisher
Hi -

If you are negotiating the buffer size then you should also negotiate if this 
is a compressed or uncompressed size.

I am curious about SLA vs. message size. If Pulsar has a 5ms SLA is that going 
to be feasible if there are very large messages?

Regards,
Dave

> On Jun 26, 2017, at 1:26 PM, merlimat  wrote:
> 
> Github user merlimat commented on the issue:
> 
>https://github.com/apache/incubator-pulsar/issues/523
> 
>> If it's configurable then client may start sending msg with unreasonable 
>> size which will be rejected at broker due to frame-size/storage-limitation, 
>> So, client will not know up-front.
> 
>Max-frame size could be negotiated between client and broker in 
> Connect/Connected handshake
> 
> 
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---



signature.asc
Description: Message signed with OpenPGP


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> If it's configurable then client may start sending msg with unreasonable 
size which will be rejected at broker due to frame-size/storage-limitation, So, 
client will not know up-front.

Max-frame size could be negotiated between client and broker in 
Connect/Connected handshake


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> If it's configurable then client may start sending msg with unreasonable 
size which will be rejected at broker due to frame-size/storage-limitation, So, 
client will not know up-front.

Max-frame size could be negotiated between client and broker in 
Connect/Connected handshake


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> One other thing we could do is to make the Max to be configurable, since 
the 5MB was very arbitrary anyway.

If it's configurable then client may start sending msg with unreasonable 
size which will be rejected at broker due to  
[frame-size/storage-limitation](https://github.com/apache/incubator-pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/api/PulsarDecoder.java#L60),
 So, client will not know up-front.

> If you can configure it on broker and client, you can raise and be 100% 
sure it gets accepted

If it's configurable at client then value might be different than what 
broker is allowing and it requires some type of sync when client connects to 
broker.? Also, it may not address payload which can be highly compressible 
(eg.xml,json)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
> One other thing we could do is to make the Max to be configurable, since 
the 5MB was very arbitrary anyway.

If it's configurable then client may start sending msg with unreasonable 
size which will be rejected at broker due to  
[frame-size/storage-limitation](https://github.com/apache/incubator-pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/api/PulsarDecoder.java#L60),
 So, client will not know up-front.

> If you can configure it on broker and client, you can raise and be 100% 
sure it gets accepted

If it's configurable at client then value might be different than what 
broker is allowing and it requires some type of sync when client connects to 
broker.? Also, it may not address payload which can be highly compressible 
(eg.xml,json)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
My only concern is that it might be tricky to know upfront whether the 
message would be accepted or rejected (If you send msg >5MB). 

One other thing we could do is to make the Max to be configurable, since 
the 5MB was very arbitrary anyway.

If you can configure it on broker and client, you can raise and be 100% 
sure it gets accepted, even though compression didn't reduce the size for a 
particular message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread merlimat
Github user merlimat commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
My only concern is that it might be tricky to know upfront whether the 
message would be accepted or rejected (If you send msg >5MB). 

One other thing we could do is to make the Max to be configurable, since 
the 5MB was very arbitrary anyway.

If you can configure it on broker and client, you can raise and be 100% 
sure it gets accepted, even though compression didn't reduce the size for a 
particular message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
yes, right now, client considers uncompressed message size while sending 
message to broker and I don't see any clear reason why can't we consider 
compressed-message size instead, because broker stores compressed data only to 
to BK which will be < 5MB only. So, I think we can add validation after 
compressing the payload.
@merlimat do you see any issue for considering compressed size while 
sending msg to broker.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-26 Thread rdhabalia
Github user rdhabalia commented on the issue:

https://github.com/apache/incubator-pulsar/issues/523
  
yes, right now, client considers uncompressed message size while sending 
message to broker and I don't see any clear reason why can't we consider 
compressed-message size instead, because broker stores compressed data only to 
to BK which will be < 5MB only. So, I think we can add validation after 
compressing the payload.
@merlimat do you see any issue for considering compressed size while 
sending msg to broker.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-pulsar issue #523: Max Payload size is validated before compressio...

2017-06-23 Thread sschepens
GitHub user sschepens opened an issue:

https://github.com/apache/incubator-pulsar/issues/523

Max Payload size is validated before compression

Max Message Payload size is validated by `MessageBuilderImpl` and that 
happens before producer gets to compress payload.
This should be validated after compression to allow big payload that are 
highly compressible (json) to be sent.







---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---