Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-10-04 Thread Martin Sustrik
Hi,

Ok. You can do that. Install a SIGABRT handler, which will get called when
an assertion fails.

However, keep in mind that an assertion failure basically indicates a
Byzantine failure, i.e. you have no guarantees about the state of the
process whatsoever.

For example, the process memory may be overwritten. If you try, say, to
save the data to the database, you may end up overwriting the current
consistent, although a bit stale, data with utter junk.
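
A minimal sketch of such a handler, assuming a POSIX system. The names
on_abort and install_abort_handler are made up for illustration and are not
part of libzmq. Because the process state cannot be trusted at this point,
the handler sticks to an async-signal-safe call (write) and does not touch
application data.

```c
#include <signal.h>
#include <unistd.h>

/* Runs when abort() raises SIGABRT. Only async-signal-safe calls are
   used here, since the process state may already be corrupt. */
static void on_abort (int signo)
{
    static const char msg[] = "fatal: assertion failed, shutting down\n";
    (void) signo;
    if (write (STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* Nothing safe can be done about a failed write here. */
    }
    /* When the handler returns, abort() still terminates the process. */
}

/* Hypothetical helper: installs the handler.
   Returns 0 on success, -1 on failure. */
int install_abort_handler (void)
{
    return signal (SIGABRT, on_abort) == SIG_ERR ? -1 : 0;
}
```

An application would call install_abort_handler () once at startup, before
creating any 0MQ sockets.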

Martin

On 10/03/2011 01:23 AM, Elliot Saba wrote:

> I second this notion: it allows for much more graceful error handling,
> even in the case of errors that "should never happen".  This makes the
> error checking that the asserts do much more meaningful for users, not
> just for zmq developers.
> -E
>
> On Sun, Oct 2, 2011 at 11:04 AM, Ian Barber wrote:
>
> On Sat, Oct 1, 2011 at 9:58 AM, Martin Sustrik wrote:
>
>  > What can be done is setting a global handler function that will be
>  > called if a bug is hit. It's not clear what the application should do
>  > then though. Maybe it can save its state and restart itself?
>  >
>
> This would be a very good option I think. Like you say, I think the
> main advantage would be to allow orderly shutdown of other parts of an
> application.
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-10-02 Thread Elliot Saba
I second this notion: it allows for much more graceful error handling, even
in the case of errors that "should never happen".  This makes the error
checking that the asserts do much more meaningful for users, not just for
zmq developers.
-E

On Sun, Oct 2, 2011 at 11:04 AM, Ian Barber  wrote:

> On Sat, Oct 1, 2011 at 9:58 AM, Martin Sustrik  wrote:
>
> > What can be done is setting a global handler function that will be
> > called if a bug is hit. It's not clear what the application should do
> > then though. Maybe it can save its state and restart itself?
> >
>
> This would be a very good option I think. Like you say, I think the
> main advantage would be to allow orderly shutdown of other parts of an
> application.
>
> Ian


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-10-02 Thread Ian Barber
On Sat, Oct 1, 2011 at 9:58 AM, Martin Sustrik  wrote:

> What can be done is setting a global handler function that will be
> called if a bug is hit. It's not clear what the application should do
> then though. Maybe it can save its state and restart itself?
>

This would be a very good option I think. Like you say, I think the
main advantage would be to allow orderly shutdown of other parts of an
application.

Ian


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-10-01 Thread Martin Sustrik
Hi Zed,

> I was asked to report all asserts I encounter.  I went to the JIRA to
> submit this as a bug, but it looks like I have to create an account,
> or something, can't really figure it out even though I've used JIRA
> before.  I'm guessing this is the next place to report a bug, so here
> you go.

As Mikko says, the bug is already reported.

> On another note:  Causing a full assert abort in *my* program from
> *your* library because of a little hiccup in an external resource is
> stupid.

This is not a hiccup in a resource. It looks more like a synchronisation
issue.

>  I've been saying for close to a year now that *all* of the
> zmq_asserts need to go away.

Asserts check for bugs. To get rid of them we have to fix bugs. The
other option is to ignore the bugs and allow 0MQ to continue operating
in a broken state. That's OK as long as you are happy with undefined
behaviour.

If you really want that, I can add a compile-time option to ignore all
the asserts. It's just a few lines of code, so let me know.
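
A rough sketch of what such a compile-time switch could look like. The names
MY_IGNORE_ASSERTS and my_assert are hypothetical and do not exist in libzmq;
the message format only imitates the output quoted in the bug report.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names: neither MY_IGNORE_ASSERTS nor my_assert exists
   in libzmq. Build with -DMY_IGNORE_ASSERTS to disable the checks. */
#ifdef MY_IGNORE_ASSERTS
/* Still evaluate the expression for its side effects, but ignore it. */
#define my_assert(x) ((void) (x))
#else
#define my_assert(x) \
    do { \
        if (!(x)) { \
            fprintf (stderr, "Assertion failed: %s (%s:%d)\n", \
                #x, __FILE__, __LINE__); \
            abort (); \
        } \
    } while (0)
#endif

/* Example caller: aborts on odd input unless the asserts are compiled out. */
int half_of_even (int n)
{
    my_assert (n % 2 == 0);
    return n / 2;
}
```

With the switch enabled, half_of_even (7) would quietly return 3 instead of
aborting, which is exactly the "continue in a broken state" trade-off.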

> libzmq needs to return valid error codes
> and stop aborting *my* servers.  Until they're gone completely I can't
> trust that some random socket error I have no control over won't abort
> my whole world. And, having to troll through C++ code to debug why I'm
> getting the error is annoying.

The errors can happen asynchronously, in a background I/O thread, when
there is no API call in progress that could return an error code.

What can be done is setting a global handler function that will be 
called if a bug is hit. It's not clear what the application should do 
then though. Maybe it can save its state and restart itself?
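
To make the idea concrete, here is a sketch under stated assumptions. Every
name in it (bug_handler_fn, set_bug_handler, notify_bug, checked, log_bug) is
invented for illustration; libzmq has no such API. The handler is notified
first, and the process still aborts afterwards.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical API: none of these names exist in libzmq. */
typedef void (*bug_handler_fn) (const char *expr, const char *file, int line);

static bug_handler_fn bug_handler = NULL;

/* Registers the process-wide handler; pass NULL to clear it. */
void set_bug_handler (bug_handler_fn fn)
{
    bug_handler = fn;
}

/* Invokes the handler, if any. Returns 1 if a handler ran, 0 otherwise. */
int notify_bug (const char *expr, const char *file, int line)
{
    if (bug_handler == NULL)
        return 0;
    bug_handler (expr, file, line);
    return 1;
}

/* Gives the application a chance to react, then aborts as before. */
#define checked(x) \
    do { \
        if (!(x)) { \
            notify_bug (#x, __FILE__, __LINE__); \
            abort (); \
        } \
    } while (0)

/* Example application handler: count and log the failure. */
int bugs_seen = 0;

void log_bug (const char *expr, const char *file, int line)
{
    ++bugs_seen;
    fprintf (stderr, "bug hit: %s (%s:%d)\n", expr, file, line);
}
```

An application would call set_bug_handler (log_bug) at startup. Whatever the
handler does, it runs in a process whose state is already suspect.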

> At a minimum, add a 3rd parameter that
> gives an error message that's other than something like "No such file
> or directory".

The asserts can be enhanced with longer messages, like, in this case,
"kqueue has returned an unexpected error: no such file or directory". I
am not sure how helpful that would be, though.
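
For illustration only, a longer message of that kind could be built as in
the sketch below. describe_failure is a hypothetical helper, not a libzmq
function; it just prefixes the errno text with the name of the subsystem
that failed.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libzmq: writes a message that names
   the failing subsystem as well as the errno text into buf. Returns the
   value reported by snprintf. */
int describe_failure (char *buf, size_t size, const char *subsystem, int err)
{
    return snprintf (buf, size, "%s returned an unexpected error: %s",
        subsystem, strerror (err));
}
```

Calling describe_failure (buf, sizeof buf, "kqueue", ENOENT) yields the
subsystem name followed by whatever text strerror gives for ENOENT on the
platform.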

> WTF does that even mean for kqueue?  I sure as hell
> didn't do anything to cause that. How could I possibly fix that?

[ENOENT] The event could not be found to be modified or deleted.

What's happening, I guess, is that an event is referenced that was 
already removed from the kqueue.

In any case, I have no OSX system to reproduce the problem. If anyone
cares to give me remote access, I can try to fix it.

Martin


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-10-01 Thread Elliot Saba
I have created a small test case that produces this error; please see the
JIRA issue.

It occurs when I create a ROUTER socket, send a message, then disconnect,
and repeat that many times over.
-E

On Fri, Sep 30, 2011 at 6:47 PM, Mikko Koppanen wrote:

> On Fri, Sep 30, 2011 at 11:22 PM, Zed Shaw  wrote:
> > I was asked to report all asserts I encounter.  I went to the JIRA to
> > submit this as a bug, but it looks like I have to create an account,
> > or something, can't really figure it out even though I've used JIRA
> > before.  I'm guessing this is the next place to report a bug, so here
> > you go.
> >
> > If I run ZeroMQ 2.1.9 for even a reasonably complex load on OSX Lion I
> > get this:
> >
> > No such file or directory
> > rc != -1 (kqueue.cpp:76)
> > Abort trap: 6
> >
> > I'll just report it here and then maybe someone can fix this.
>
> Hi Zed,
>
> I think I ran into the same issue:
> https://zeromq.jira.com/browse/LIBZMQ-261. This usually happened to me
> if the other peer disconnected. I haven't seen this issue after
> applying the patch mentioned in the issue.
>
>
> --
> Mikko Koppanen


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-09-30 Thread Mikko Koppanen
On Fri, Sep 30, 2011 at 11:22 PM, Zed Shaw  wrote:
> I was asked to report all asserts I encounter.  I went to the JIRA to
> submit this as a bug, but it looks like I have to create an account,
> or something, can't really figure it out even though I've used JIRA
> before.  I'm guessing this is the next place to report a bug, so here
> you go.
>
> If I run ZeroMQ 2.1.9 for even a reasonably complex load on OSX Lion I get 
> this:
>
> No such file or directory
> rc != -1 (kqueue.cpp:76)
> Abort trap: 6
>
> I'll just report it here and then maybe someone can fix this.

Hi Zed,

I think I ran into the same issue:
https://zeromq.jira.com/browse/LIBZMQ-261. This usually happened to me
if the other peer disconnected. I haven't seen this issue after
applying the patch mentioned in the issue.


-- 
Mikko Koppanen


Re: [zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-09-30 Thread Steven McCoy
On 30 September 2011 18:22, Zed Shaw  wrote:

> 
> And, having to troll through C++ code to debug why I'm
> getting the error is annoying.  At a minimum, add a 3rd parameter that
> gives an error message that's other than something like "No such file
> or directory".  WTF does that even mean for kqueue?  I sure as hell
> didn't do anything to cause that. How could I possibly fix that?
> 
>
>
How about the OpenPGM method of error handling?  I followed the GLib route
because a single error code is just annoying and tedious, but you don't want
to add too much overhead or unnecessary confusion, since that adds to the
learning curve.


typedef struct {
  int    domain;
  int    code;
  char*  message;
} pgm_error_t;

pgm_error_t* err = NULL;

if (!pgm_getaddrinfo (network, NULL, &res, &err)) {
   fprintf (stderr, "Parsing network parameter: %s\n",
            (err && err->message) ? err->message : "(null)");
   pgm_error_free (err);
...
}
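
The snippet above shows only the caller's side. As a rough, self-contained
sketch of the library side of the same pattern, the code below uses invented
names (my_error_t, my_set_error, my_error_free, my_parse_port); it is
modelled on the GLib-style error out-parameter and is not the OpenPGM or
libzmq API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types and functions modelled on the pattern above;
   they are not the OpenPGM or libzmq API. */
typedef struct {
    int   domain;
    int   code;
    char *message;
} my_error_t;

/* Allocates an error and stores it in *err, if the caller asked for one. */
static void my_set_error (my_error_t **err, int domain, int code,
                          const char *message)
{
    if (err == NULL)
        return;
    *err = malloc (sizeof **err);
    if (*err == NULL)
        return;
    (*err)->domain = domain;
    (*err)->code = code;
    (*err)->message = malloc (strlen (message) + 1);
    if ((*err)->message != NULL)
        strcpy ((*err)->message, message);
}

/* Releases an error; passing NULL is allowed. */
void my_error_free (my_error_t *err)
{
    if (err == NULL)
        return;
    free (err->message);
    free (err);
}

/* Example operation: parses a TCP port number.
   Returns 1 on success, or 0 on failure with *err filled in. */
int my_parse_port (const char *text, int *port, my_error_t **err)
{
    char *end;
    long value = strtol (text, &end, 10);
    if (end == text || *end != '\0' || value < 1 || value > 65535) {
        my_set_error (err, 1, 1, "port must be a number between 1 and 65535");
        return 0;
    }
    *port = (int) value;
    return 1;
}
```

A caller that does not care about the details can pass NULL for the last
argument and rely on the return value alone.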


-- 
Steve-o


[zeromq-dev] [BUG] zmq_assert causes BOOM if you breath on OSX Lion kqueue wrong.

2011-09-30 Thread Zed Shaw
I was asked to report all asserts I encounter.  I went to the JIRA to
submit this as a bug, but it looks like I have to create an account,
or something, can't really figure it out even though I've used JIRA
before.  I'm guessing this is the next place to report a bug, so here
you go.

If I run ZeroMQ 2.1.9 for even a reasonably complex load on OSX Lion I get this:

No such file or directory
rc != -1 (kqueue.cpp:76)
Abort trap: 6

I'll just report it here and then maybe someone can fix this.


On another note:  Causing a full assert abort in *my* program from
*your* library because of a little hiccup in an external resource is
stupid.  I've been saying for close to a year now that *all* of the
zmq_asserts need to go away.  libzmq needs to return valid error codes
and stop aborting *my* servers.  Until they're gone completely I can't
trust that some random socket error I have no control over won't abort
my whole world. And, having to troll through C++ code to debug why I'm
getting the error is annoying.  At a minimum, add a 3rd parameter that
gives an error message that's other than something like "No such file
or directory".  WTF does that even mean for kqueue?  I sure as hell
didn't do anything to cause that. How could I possibly fix that?


Zed