Perl threads and libwww weirdness

2011-12-13 Thread Toby Wintermute
Hi,
I'm hitting some really odd behaviour, infrequently, with libwww and
mechanize under a highly-threaded Perl.
Can I get a quick check to see if what I'm doing is known to work
reliably for you?


I have encountered a situation where I see unusual 404 errors on
between 0.03% and 0.10% of requests.
Errors are randomly spaced on random pages, but over time the average
amounts are quite consistent.
Error rates initially increase with the number of simultaneous
threads, but seem to top out at 0.1% (i.e. one in a thousand requests).

The 404 errors are reported on the distant webserver as well, for URLs
that are definitely not 404. (as the identical URL is being requested
successfully many times in the same period).

Scale: This is typically running around 40 threads, all going flat-out
on an 8-core system; issues show up whenever you get over ~6 threads
though.

The only reason I don't think this is a problem with the network or
webserver is that the problems don't show up if I use fork() instead
of threads. (On otherwise identical code; and the same overall
throughput rates are reached. However the fork() version is just for
that bit of code for testing this; it misses some functionality.)

It's also being a pain to try and replicate the threaded issue with a
standalone server away from our code though, which isn't a good sign.

Hence, I'm looking for some confirmation of whether this might just be a
known bug with Perl or libwww before I go chasing down this rabbit
hole for miles. :(

This was running on Perl 5.14.1 and current versions of the above
modules. (I don't think there are any bugfixes listed in 5.14.2 that
would affect this issue? [1])

Any thoughts?

Thanks!
Toby

[1] http://search.cpan.org/~flora/perl-5.14.2/pod/perldelta.pod

-- 
Turning and turning in the widening gyre
The falcon cannot hear the falconer
Things fall apart; the center cannot hold
Mere anarchy is loosed upon the world


Re: Perl threads and libwww weirdness

2011-12-13 Thread Toby Wintermute
On 14 December 2011 12:16, Toby Wintermute  wrote:
> Hi,
> I'm hitting some really odd behaviour, infrequently, with libwww and
> mechanize under a highly-threaded Perl.


Worth noting - I'm pretty sure LWP is all pure-perl, but mechanize
calls some XS libraries - HTML::Parser, PullParser and TokeParser.

TC


Re: Perl threads and libwww weirdness

2011-12-14 Thread Leo Lapworth
I still think it's your webserver

On 14 December 2011 01:16, Toby Wintermute  wrote:
> I'm hitting some really odd behaviour, infrequently, with libwww and
> mechanize under a highly-threaded Perl.

> I have encountered a situation where I see unusual 404 errors - in
> between 0.03% to 0.10% of requests.
> Errors are randomly spaced on random pages
>
> The 404 errors are reported on the distant webserver as well, for URLs
> that are definitely not 404. (as the identical URL is being requested
> successfully many times in the same period).

My logic would be, if the webserver is reporting intermittent 404's for a
specific URL, then it's the webserver that's generating it.

> Scale: This is typically running around 40 threads, all going flat-out
> on an 8-core system; issues show up whenever you get over ~6 threads
> though.

So the webserver can't cope with the traffic generated when you use
more than 6 threads, so too many requests.

> The only reason I don't think this is a problem with the network or
> webserver is that the problems don't show up if I use fork() instead
> of threads. (On otherwise identical code; and the same overall
> throughput rates are reached. However the fork() version is just for
> that bit of code for testing this; it misses some functionality.)

Does the fork() submit the same number of requests in the same
time period? - if it's less than your thread version then that
would point to the webserver, not the Perl code.

I'd look into if your webserver (you don't mention what software it
is, apache/starman/IIS?) has some sort of "a, I can't cope,
throw a 404" setting or bug.

Good luck!

Leo


Re: Perl threads and libwww weirdness

2011-12-14 Thread Toby Wintermute
On 14 December 2011 19:44, Leo Lapworth  wrote:
> I still think it's your webserver

See below..

> On 14 December 2011 01:16, Toby Wintermute  wrote:
>> I'm hitting some really odd behaviour, infrequently, with libwww and
>> mechanize under a highly-threaded Perl.
>
>> I have encountered a situation where I see unusual 404 errors - in
>> between 0.03% to 0.10% of requests.
>> Errors are randomly spaced on random pages
>>
>> The 404 errors are reported on the distant webserver as well, for URLs
>> that are definitely not 404. (as the identical URL is being requested
>> successfully many times in the same period).
>
> My logic would be, if the webserver is reporting intermittent 404's for a
> specific URL, then it's the webserver that's generating it.

That would be my logic too, but (a) I can't work out why it would drop
one in a thousand requests only, and (b) the non-threaded Perl scripts
don't see ANY 404 errors.

>> Scale: This is typically running around 40 threads, all going flat-out
>> on an 8-core system; issues show up whenever you get over ~6 threads
>> though.
>
> So the webserver can't cope with the traffic generated when you use
> more than 6 threads, so too many requests.

I start seeing errors appear before the webserver is getting fully
loaded up though. The machine serving the requests is quite powerful,
and there's only one machine making all the requests.

>> The only reason I don't think this is a problem with the network or
>> webserver is that the problems don't show up if I use fork() instead
>> of threads. (On otherwise identical code; and the same overall
>> throughput rates are reached. However the fork() version is just for
>> that bit of code for testing this; it misses some functionality.)
>
> Does the fork() submit the same number of requests in the same
> time period? - if it's less than your thread version then that
> would point to the webserver, not the Perl code.

That's the thing -- the fork-based version will generate just as many
hits, but with zero 404 errors.
So the webserver can handle that level of traffic.

E.g. I can pull more than 260/sec off the webserver with either the
fork-based or the thread-based app. But the thread-based app starts
seeing errors creep in once it gets over around 180-200/sec, and by
260/sec it's up to 0.1% of all requests (i.e. one error every few seconds).

> I'd look into if your webserver (you don't mention what software it
> is, apache/starman/IIS?) has some sort of "a, I can't cope,
> throw a 404" setting or bug.

We've now replicated the issue on both Apache 2.2 and nginx, and also
against an older machine, which coped with a lower overall hit rate but
still showed similar error rates.

I'm just annoyed I haven't managed to replicate the issue on
stand-alone code I can demonstrate on a desktop class machine just yet
:/

Thanks,
Toby


Re: Perl threads and libwww weirdness

2011-12-14 Thread Rudolf Lippan
On Tuesday, December 13, 2011 at 08:16:04 PM, Toby Wintermute wrote:
> Hi,

Hi

> The 404 errors are reported on the distant webserver as well, for URLs
> that are definitely not 404. (as the identical URL is being requested
> successfully many times in the same period).
> 

The server should know why it is giving 404s. If it does not wish to share
with you...

> 
> The only reason I don't think this is a problem with the network or
> webserver is that the problems don't show up if I use fork() instead
> of threads. (On otherwise identical code; and the same overall
> throughput rates are reached. However the fork() version is just for
> that bit of code for testing this; it misses some functionality.)
> 

Just a small change in performance characteristics can make a big difference
when you are pushing things.

I recently had a case where I optimized the network writes of a program and it
caused some major confusion at the (stateful) firewall.  The driver program
had one core pegged, so there was no change in average throughput.


> It's also being a pain to try and replicate the threaded issue with a
> standalone server away from our code though, which isn't a good sign.

If you can, mirror a port on your switch and do a packet capture. That should
give you exactly what is going over the wire, and you don't have to trust
either system to tell you the truth.

-r






Re: Perl threads and libwww weirdness

2011-12-14 Thread Yitzchak Scott-Thoennes
Does looking at the response headers or content of the 404s show
anything of interest?

If you have control of the webserver, on a 404, verify that the
request looks correct, too.
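
One possible way to capture both from the client side is sketched below.
It assumes the requests go through LWP::UserAgent (WWW::Mechanize is a
subclass, so the same handler call should apply); the helper name
make_logging_ua and the log format are made up for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: build a user agent that records the full request
# and response text whenever a 404 comes back.
sub make_logging_ua {
    my ($log) = @_;    # code ref that receives the text to record

    my $ua = LWP::UserAgent->new;

    # "response_done" handlers run once the response has been received.
    $ua->add_handler(response_done => sub {
        my ($response, $ua, $handler) = @_;
        return unless $response->code == 404;
        $log->("--- request ---\n"  . $response->request->as_string
             . "--- response ---\n" . $response->as_string);
        return;
    });

    return $ua;
}
```

Something like `my $ua = make_logging_ua(sub { warn @_ });` followed by
the usual `$ua->get($url)` calls would then print the headers and body
for each 404, which can be compared with the server's log entry.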


Re: Perl threads and libwww weirdness

2011-12-15 Thread Peter Vereshagin
Hello.

2011/12/14 12:16:04 +1100 Toby Wintermute => To London.pm Perl M[ou]ngers :
TW> I'm hitting some really odd behaviour, infrequently, with libwww and
TW> mechanize under a highly-threaded Perl.

TW> Error rates initially increase with the number of simultaneous
TW> threads, but seem to top off at .1%. (ie. One in a thousand requests)

(i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
production environments.

Forks are the standard IPC method for perl5. Event-driven stuff (AE/EV) is a
more advanced way, but for 40 parallel requests the forks are just ok.

E.g., I have used MS Windows native threads behind the scenes, visible as
fork()s from perl5.

--
Peter Vereshagin  (http://vereshagin.org) pgp: A0E26627 


Re: Perl threads and libwww weirdness

2011-12-15 Thread Toby Wintermute
2011/12/16 Peter Vereshagin :
> Hello.
>
> 2011/12/14 12:16:04 +1100 Toby Wintermute  => To 
> London.pm Perl M[ou]ngers :
> TW> I'm hitting some really odd behaviour, infrequently, with libwww and
> TW> mechanize under a highly-threaded Perl.
>
> TW> Error rates initially increase with the number of simultaneous
> TW> threads, but seem to top off at .1%. (ie. One in a thousand requests)
>
> (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
> production environments.

I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
but I thought things had improved since then..

> Forks are the standard IPC method for perl5. Event-driven stuff (AE/EV) is a
> more advanced way, but for 40 parallel requests the forks are just ok.

I'm not sure IPC means what you think it means - or I'm
misunderstanding you - it's inter-process communication, right?
I'd say that forking, threading and event-driven code are all methods of
doing simultaneous processing, but none of them specifies anything about IPC.

Which is the reason I was using threads - it's easy to do IPC between
them by sharing access to some data structures.
Whereas with forking you need to set up pipes, posix shared memory,
sockets, or some other means.

In the end, I bit the bullet yesterday and did rewrite the code to use
fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
IPC.
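
For comparison, the simplest of the options listed above (a pipe per
child) might look roughly like the sketch below. It uses only core
modules; the helper name run_in_child and the result layout are invented
for illustration, and this is not the rewritten code described here.

```perl
use strict;
use warnings;
use POSIX ();
use Storable qw(freeze thaw);

# Hypothetical helper: run $work in a forked child and hand its result
# (any reference Storable can serialise) back to the parent over a pipe.
sub run_in_child {
    my ($work) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the work, serialise the result, write it and leave.
        close $reader;
        binmode $writer;
        print {$writer} freeze($work->());
        close $writer;
        POSIX::_exit(0);    # skip END blocks and inherited output buffers
    }

    # Parent: read everything the child wrote, then reap the child.
    close $writer;
    binmode $reader;
    my $frozen = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);

    return thaw($frozen);
}

my $result = run_in_child(sub { return { pid => $$, status => 200 } });
print "child $result->{pid} reported status $result->{status}\n";
```

This only covers the "return a result to the parent" case; anything that
needs two-way or child-to-child messaging needs more plumbing, which is
where a message broker starts to look attractive.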

Total throughput is actually slightly faster than the threaded
version, it uses quite a bit less memory, and critically, doesn't get
any spurious errors.

-Toby


Re: Perl threads and libwww weirdness

2011-12-15 Thread Peter Vereshagin
Hello.

2011/12/16 12:38:16 +1100 Toby Wintermute => To London.pm Perl M[ou]ngers :
TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
TW> > production environments.
TW> 
TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
TW> but I thought things had improved since then..

They did. Perl6 was released and it seems to have threads that can be
recommended.
Perl5 has fork(), which 'just works' and seems to be enough.

TW> > Forks are the standard IPC method for perl5. Event-driven stuff (AE/EV) is a
TW> > more advanced way, but for 40 parallel requests the forks are just ok.
TW> 
TW> I'm not sure IPC means what you think it means - or I'm
TW> misunderstanding you - It's Inter-process communication, right?
TW> It'd say that forking, threading, and event-driven, are all methods of
TW> doing simultaneous processing - but don't specify anything about IPC.


Right. I just tend to avoid long phrases like 'method of doing
simultaneous processing' and say 'perlipc' instead, in this case because:

$ man perlipc | grep fork | wc -l
  37
$ man perlipc | grep thread | wc -l
   4

and getting several perl processes to communicate with each other is simply
easier by means of fork().

TW> Which is the reason I was using threads - it's easy to do IPC between

... processes? (=

TW> them by sharing access to some data structures.
TW> Whereas with forking you need to set up pipes, posix shared memory,
TW> sockets, or some other means.

A memory-mapped file doesn't seem to be bad:

https://metacpan.org/module/IPC::MMA

TW> In the end, I bit the bullet yesterday and did rewrite the code to use
TW> fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
TW> IPC.

Ouch. Isn't *MQ liable to 'lose data under some circumstances'?

TW> Total throughput is actually slightly faster than the threaded
TW> version, it uses quite a bit less memory, and critically, doesn't get
TW> any spurious errors.

That's it about getting a life with perl5: use forks, be happy. (=

PS. I would like to know if there is a threads.pm implementation that uses
native threads on MS Windows.

--
Peter Vereshagin  (http://vereshagin.org) pgp: A0E26627 


Re: Perl threads and libwww weirdness

2011-12-17 Thread Yitzchak Scott-Thoennes
On Thu, Dec 15, 2011 at 5:38 PM, Toby Wintermute  wrote:
> Which is the reason I was using threads - it's easy to do IPC between
> them by sharing access to some data structures.
> Whereas with forking you need to set up pipes, posix shared memory,
> sockets, or some other means.
>
> In the end, I bit the bullet yesterday and did rewrite the code to use
> fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
> IPC.

If all you need is to return results of a discrete bit of work to the
parent, Parallel::ForkManager makes it trivial.


Re: Perl threads and libwww weirdness

2011-12-17 Thread Dave Hodgkinson

On 18 Dec 2011, at 02:57, Yitzchak Scott-Thoennes wrote:

> On Thu, Dec 15, 2011 at 5:38 PM, Toby Wintermute  wrote:
>> Which is the reason I was using threads - it's easy to do IPC between
>> them by sharing access to some data structures.
>> Whereas with forking you need to set up pipes, posix shared memory,
>> sockets, or some other means.
>> 
>> In the end, I bit the bullet yesterday and did rewrite the code to use
>> fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
>> IPC.
> 
> If all you need is to return results of a discrete bit of work to the
> parent, Parallel::ForkManager makes it trivial.

And the best way to return those results...?


Re: Perl threads and libwww weirdness

2011-12-18 Thread Ruud H.G. van Tol

On 2011-12-18 08:43, Dave Hodgkinson wrote:
> On 18 Dec 2011, at 02:57, Yitzchak Scott-Thoennes wrote:

>> If all you need is to return results of a discrete bit of work to the
>> parent, Parallel::ForkManager makes it trivial.
>
> And the best way to return those results...?

Unless the exit code is information enough:
I like to call it "merge results", and involve a database.

--
Ruud


Re: Perl threads and libwww weirdness

2011-12-18 Thread Yitzchak Scott-Thoennes
On Sat, Dec 17, 2011 at 11:43 PM, Dave Hodgkinson  wrote:
> On 18 Dec 2011, at 02:57, Yitzchak Scott-Thoennes wrote:
>> If all you need is to return results of a discrete bit of work to the
>> parent, Parallel::ForkManager makes it trivial.
>
> And the best way to return those results...?

$pfm->finish($exit_code, $data_structure_reference).

(Requires P::FM 0.7.6 or later.)
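
A minimal sketch of how that fits together, assuming Parallel::ForkManager
is installed; the job list and the "square" payload are placeholders, not
anything from the original code:

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical job list; in the thread these would be URLs to fetch.
my @jobs = (1 .. 5);
my %results;

my $pfm = Parallel::ForkManager->new(3);    # at most 3 children at once

# Runs in the parent each time a child exits; the last argument is the
# reference that the child handed to finish().
$pfm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
    $results{$ident} = $data->{square} if defined $data;
});

for my $job (@jobs) {
    $pfm->start($job) and next;             # parent: go on to the next job
    # Child: do the work and pass a data structure back to the parent.
    $pfm->finish(0, { square => $job * $job });
}
$pfm->wait_all_children;

print "$_ => $results{$_}\n" for sort { $a <=> $b } keys %results;
```

The identifier passed to start() comes back as $ident in the callback,
which is a convenient way to match each result to its job.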


Re: Perl threads and libwww weirdness

2011-12-18 Thread Toby Wintermute
2011/12/16 Peter Vereshagin :
> Hello.
>
> 2011/12/16 12:38:16 +1100 Toby Wintermute  => To 
> London.pm Perl M[ou]ngers :
> TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
> TW> > production environments.
> TW>
> TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
> TW> but I thought things had improved since then..
>
> They did. Perl6 was released and it seems to have threads those can be 
> recommended.
> Perl5 have fork() that 'just works' and seems to be enough.

So, you're saying that threads under perl5 is forever going to be
considered broken and not worth touching then? :(

Why do all the main distros ship a threading-enabled Perl? I note that
Padre won't build without threads enabled either.


> TW> > Forks are the standard IPC method for perl5. Event-driven stuff (AE/EV) is a
> TW> > more advanced way, but for 40 parallel requests the forks are just ok.
> TW>
> TW> I'm not sure IPC means what you think it means - or I'm
> TW> misunderstanding you - It's Inter-process communication, right?
> TW> It'd say that forking, threading, and event-driven, are all methods of
> TW> doing simultaneous processing - but don't specify anything about IPC.
>
>
> Right. I just use to avoid telling many words like 'method of doing
> simultaneous processing' in favor of 'perlipc' in this case because:
>
>    $ man perlipc | grep fork | wc -l
>          37
>    $ man perlipc | grep thread | wc -l
>           4
>
> and having several perl processes to communicate between is made just easier
> by mean of fork().

How is fork() making it easier to communicate between processes? On
the contrary, I say that using threads::shared makes it far
easier than fork() and rolling your own IPC.


> TW> Which is the reason I was using threads - it's easy to do IPC between
>
> ... processes? (=
>
> TW> them by sharing access to some data structures.
> TW> Whereas with forking you need to set up pipes, posix shared memory,
> TW> sockets, or some other means.
>
> Memory-mapped file doesn't seem to be bad:
>
>    https://metacpan.org/module/IPC::MMA

As far as I can tell, that doesn't offer any way to perform
synchronisation/condvars or suchlike - ie. where you can control the
flow of one thread from another.
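
To illustrate the kind of control meant here, a small sketch using the
lock/cond_wait/cond_signal primitives from threads::shared. It needs a
perl built with thread support, and the variable names are invented for
the example:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared state: a flag the worker waits on and a slot for its answer.
my $go     :shared = 0;
my $answer :shared;

my $worker = threads->create(sub {
    {
        lock($go);
        cond_wait($go) until $go;    # sleeps until the main thread signals
    }
    lock($answer);
    $answer = "worker " . threads->tid . " released";
    return;
});

{
    lock($go);
    $go = 1;
    cond_signal($go);                # wake the waiting worker
}
$worker->join;
print "$answer\n";
```

The worker does nothing until the main thread sets the flag and signals,
which is the "control the flow of one thread from another" part.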


> TW> In the end, I bit the bullet yesterday and did rewrite the code to use
> TW> fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
> TW> IPC.
>
> Ouch. Isn't *MQ about to 'lose data under some circumstances' ?

No, not really. What makes you think that?
It depends entirely on how you configure and use it - you can have
messages you want it to make sure you really, really don't lose - or
you can have messages that are more transient.

-Toby


Re: Perl threads and libwww weirdness

2011-12-19 Thread Yitzchak Scott-Thoennes
On Sun, Dec 18, 2011 at 11:31 PM, Toby Wintermute  wrote:
> 2011/12/16 Peter Vereshagin :
>> 2011/12/16 12:38:16 +1100 Toby Wintermute  => To 
>> London.pm Perl M[ou]ngers :
>> TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
>> TW> > production environments.
>> TW>
>> TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
>> TW> but I thought things had improved since then..
>>
>> They did. Perl6 was released and it seems to have threads those can be 
>> recommended.
>> Perl5 have fork() that 'just works' and seems to be enough.
>
> So, you're saying that threads under perl5 is forever going to be
> considered broken and not worth touching then? :(
>
> Why do all the main distros ship a threading-enabled Perl? I note that
> Padre won't build without threads enabled either.

Because threads are cool and all the cool languages have them?

My take is that because perl's ithreads are so expensive (in time
and memory both to start them up and to share data), you are almost
always better off designing your code to work with forks instead.
I just don't see the compelling use case for threads, unless you are
stuck in the threading mindset.


Re: Perl threads and libwww weirdness

2011-12-19 Thread Dave Hodgkinson

On 18 Dec 2011, at 14:48, Ruud H.G. van Tol wrote:

> On 2011-12-18 08:43, Dave Hodgkinson wrote:
>> On 18 Dec 2011, at 02:57, Yitzchak Scott-Thoennes wrote:
> 
>>> If all you need is to return results of a discrete bit of work to the
>>> parent, Parallel::ForkManager makes it trivial.
>> 
>> And the best way to return those results...?
> 
> Unless the exit code is information enough:
> I like to call it "merge results", and involve a database.

Ah, OK. I was hoping for some magic my 30 years of Unix had
overlooked :)


Re: Perl threads and libwww weirdness

2011-12-19 Thread Dave Hodgkinson

On 18 Dec 2011, at 21:06, Yitzchak Scott-Thoennes wrote:

> On Sat, Dec 17, 2011 at 11:43 PM, Dave Hodgkinson  wrote:
>> On 18 Dec 2011, at 02:57, Yitzchak Scott-Thoennes wrote:
>>> If all you need is to return results of a discrete bit of work to the
>>> parent, Parallel::ForkManager makes it trivial.
>> 
>> And the best way to return those results...?
> 
> $pfm->finish($exit_code, $data_structure_reference).
> 
> (Requires P::FM 0.7.6 or later.)

What method does this use under the hood?

Ah, Storable. Sweet.





Re: Perl threads and libwww weirdness

2011-12-20 Thread Nicholas Clark
On Fri, Dec 16, 2011 at 12:38:16PM +1100, Toby Wintermute wrote:
> 2011/12/16 Peter Vereshagin :

> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
> > production environments.
> 
> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
> but I thought things had improved since then..

Well, "they have" in that bugs have been fixed.
But the whole design, and the constraints imposed by the "API" of the core,
mean that it has got some fairly fundamental flaws that I don't think can
ever be fixed.

Not that Python (well, CPython) or Ruby (to the best of my knowledge) have
figured out how to retrofit proper concurrency of threads performing
computation within the VM.


> Total throughput is actually slightly faster than the threaded
> version, it uses quite a bit less memory, and critically, doesn't get
> any spurious errors.

I'm not surprised by the "less memory", but it's nice to get anecdotal
confirmation of this.


On Mon, Dec 19, 2011 at 06:31:53PM +1100, Toby Wintermute wrote:
> 2011/12/16 Peter Vereshagin :
> > Hello.
> >
> > 2011/12/16 12:38:16 +1100 Toby Wintermute  => To 
> > London.pm Perl M[ou]ngers :
> > TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
> > TW> > production environments.
> > TW>
> > TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
> > TW> but I thought things had improved since then..
> >
> > They did. Perl6 was released and it seems to have threads those can be 
> > recommended.
> > Perl5 have fork() that 'just works' and seems to be enough.
> 
> So, you're saying that threads under perl5 is forever going to be
> considered broken and not worth touching then? :(

I think he did. Strictly all I can say is "I can't see how to fix it"
and if I make such statements without any further qualification then it's
always also implied "and I've thought about this for a while"

[contrast with
 http://onefte.com/2011/11/04/taking-the-rough-estimation-oath/ ]

> Why do all the main distros ship a threading-enabled Perl? I note that
> Padre won't build without threads enabled either.

Remember that the default config is no threads, and no "shared perl library".

1: You seem to be assuming that distros actually know what they are doing
   when it comes to perl
   [IIRC Mandriva *stopped* defaulting to threads at some point while Rafael
   Garcia-Suarez was working there. There. One example is proof :-)]
2: Distros don't like to ship more than one copy of perl. They see that an
   ithreads-enabled perl "supports" everything that the default config does.
   Likewise "shared perl library" means that anything embedding perl (eg
   mod_perl) takes less space. So they seem to tweak both of these from the
   defaults.

Nicholas Clark


Re: Perl threads and libwww weirdness

2011-12-25 Thread Elizabeth Mattijsen
On Dec 19, 2011, at 9:50 AM, Yitzchak Scott-Thoennes wrote:
> On Sun, Dec 18, 2011 at 11:31 PM, Toby Wintermute  wrote:
>> 2011/12/16 Peter Vereshagin :
>>> 2011/12/16 12:38:16 +1100 Toby Wintermute  => To 
>>> London.pm Perl M[ou]ngers :
>>> TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
>>> TW> > production environments.
>>> TW>
>>> TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
>>> TW> but I thought things had improved since then..
>>> 
>>> They did. Perl6 was released and it seems to have threads those can be 
>>> recommended.
>>> Perl5 have fork() that 'just works' and seems to be enough.
>> 
>> So, you're saying that threads under perl5 is forever going to be
>> considered broken and not worth touching then? :(
>> 
>> Why do all the main distros ship a threading-enabled Perl? I note that
>> Padre won't build without threads enabled either.
> 
> Because threads are cool and all the cool languages have them?
> 
> My take is that because perl's ithreads are so expensive (in time
> and memory both to start them up and to share data), you are almost
> always better off designing your code to work with forks instead.
> I just don't see the compelling use case for threads, unless you are
> stuck in the threading mindset.

And if you are, you might want to look at the forks.pm module.  It provides the 
threads.pm API using fork().




Liz


Re: Perl threads and libwww weirdness

2012-01-18 Thread Peter Vereshagin
Hello.

2011/12/19 18:31:53 +1100 Toby Wintermute => To London.pm Perl M[ou]ngers :
TW> > 2011/12/16 12:38:16 +1100 Toby Wintermute => To London.pm Perl M[ou]ngers :
TW> > TW> > (i)Threaded perl5 ( 'use threads' ) doesn't seem to be recommended for
TW> > TW> > production environments.
TW> > TW>
TW> > TW> I know it certainly wasn't recommended back in the days of 5.6 or 5.8,
TW> > TW> but I thought things had improved since then..
TW> >
TW> > They did. Perl6 was released and it seems to have threads those can be recommended.
TW> > Perl5 have fork() that 'just works' and seems to be enough.
TW> 
TW> So, you're saying that threads under perl5 is forever going to be

Nothing in this world is forever. The Chinese like to say that if you sit
by the river long enough, you will see the corpse of your enemy floating
by. Some day.

TW> considered broken and not worth touching then? :(

Just don't 'use threads;', that's all :)

The only exception is cygwin's use of native MS Windows threads behind its
fork emulation, including in cygwin's perl.

TW> Why do all the main distros ship a threading-enabled Perl? I note that
TW> Padre won't build without threads enabled either.

Do you note why it won't?

TW> > TW> > Forks are the standard IPC method for perl5. Event-driven stuff (AE/EV) is a
TW> > TW> > more advanced way, but for 40 parallel requests the forks are just ok.
TW> > TW>
TW> > TW> I'm not sure IPC means what you think it means - or I'm
TW> > TW> misunderstanding you - It's Inter-process communication, right?
TW> > TW> It'd say that forking, threading, and event-driven, are all methods of
TW> > TW> doing simultaneous processing - but don't specify anything about IPC.
TW> >
TW> >
TW> > Right. I just use to avoid telling many words like 'method of doing
TW> > simultaneous processing' in favor of 'perlipc' in this case because:
TW> >
TW> >    $ man perlipc | grep fork | wc -l
TW> >          37
TW> >    $ man perlipc | grep thread | wc -l
TW> >           4
TW> >
TW> > and having several perl processes to communicate between is made just easier
TW> > by mean of fork().
TW> 
TW> How is fork() making it easier to communicate between processes? On
TW> the contrary, I say that using threads::shared makes it far more
TW> easier than fork() and rolling your own IPC.

I don't (yet) own any IPC, sorry. (=
Do you own one for comparison then? (=

TW> > TW> Which is the reason I was using threads - it's easy to do IPC between
TW> >
TW> > ... processes? (=

Right, processes. :)

TW> > TW> them by sharing access to some data structures.
TW> > TW> Whereas with forking you need to set up pipes, posix shared memory,
TW> > TW> sockets, or some other means.
TW> >
TW> > Memory-mapped file doesn't seem to be bad:
TW> >
TW> >    https://metacpan.org/module/IPC::MMA
TW> 
TW> As far as I can tell, that doesn't offer any way to perform
TW> synchronisation/condvars or suchlike - ie. where you can control the
TW> flow of one thread from another.

Sure, it just doesn't 'use threads;' (=
But I'm pretty sure 'perldoc perlipc' will offer you such a way for processes
rather than threads, i.e. semaphores.

TW> > TW> In the end, I bit the bullet yesterday and did rewrite the code to use
TW> > TW> fork(), and then used Net::STOMP::Client with RabbitMQ to perform the
TW> > TW> IPC.
TW> >
TW> > Ouch. Isn't *MQ about to 'lose data under some circumstances' ?
TW> 
TW> No, not really. What makes you think that?
TW> It depends entirely on how you configure and use it - you can have

Because configuration can be a circumstance in some cases.

TW> messages you want it to make sure you really, really don't lose - or
TW> you can have messages that are more transient.

OK, but... why does your task make *MQ (which sounds more like a FIFO to me)
preferable to, say, MemcacheDB (which to me means key-value pair storage)?

--
Peter Vereshagin  (http://vereshagin.org) pgp: A0E26627 


Re: Perl threads and libwww weirdness

2012-01-18 Thread Yitzchak Scott-Thoennes
2012/1/18 Peter Vereshagin :
> Just don't 'use threads;', that's all :)
>
> The only exception is the cygwin's use of native ms-windows threads behind its
> forks emulation, including cygwin's perl.

Unless things have changed since I last used it, cygwin uses real
processes for forks.  It's ActiveState's perl (and presumably the
mingw perl too) that emulates fork with threads.


Re: Perl threads and libwww weirdness (Leo Lapworth)

2011-12-14 Thread Ashley Hindmarsh
>
> --
>
> Message: 1
> Date: Wed, 14 Dec 2011 08:44:37 +
> From: Leo Lapworth 
> Subject: Re: Perl threads and libwww wierdness
> To: "London.pm Perl M[ou]ngers" 
> Message-ID:
> >
> Content-Type: text/plain; charset=ISO-8859-1
>
> I still think it's your webserver
>
> On 14 December 2011 01:16, Toby Wintermute  wrote:
> > I'm hitting some really odd behaviour, infrequently, with libwww and
> > mechanize under a highly-threaded Perl.
>
> > I have encountered a situation where I see unusual 404 errors - in
> > between 0.03% to 0.10% of requests.
> > Errors are randomly spaced on random pages
> >
> > The 404 errors are reported on the distant webserver as well, for URLs
> > that are definitely not 404. (as the identical URL is being requested
> > successfully many times in the same period).
>
> My logic would be, if the webserver is reporting intermittent 404's for a
> specific URL, then it's the webserver that's generating it.
>

In general, I'd agree with Leo, although there may be wrinkles in the HTTP
request you aren't seeing (cache headers).

Have you tried a detailed dump of each outbound request?

If the service owner can't give you a reason for the 404s on a valid URL,
then you might consider a proxy to get more detailed diagnostics on the
request/response.

  Ash
